Chapter 10. Tika and the Lucene search stack


This chapter covers


We’re going to take a break from our in-depth tour of the Tika framework. By now, those topics should be second nature to you. But you may not be so comfortable with names like Mahout, or Droids, or (eep!) Open Relevance.

Though these names might sound foreign, they’re common terminology to those familiar with the Apache Lucene[1] family of search-related applications. Lucene is an Apache Top Level Project, or TLP, and was originally home to a number of search-related software products that have themselves grown to TLP status, including Tika.

1 The name Lucene was Doug Cutting’s wife’s middle name, ...
