Chapter 2. Indexing and Searching Data

In this chapter, we will cover the following recipes:

  • Indexing data with Apache Lucene
  • Searching indexed data with Apache Lucene

Introduction

In this chapter, you will learn two closely related recipes. The first demonstrates how to index your data, and the second demonstrates how to search through the data you have indexed.

For both indexing and searching, we will use Apache Lucene, a free, open-source Java library that is widely used for information retrieval. It is supported by the Apache Software Foundation and is released under the Apache License.
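
To make the link between the two recipes concrete before turning to Lucene itself, here is a minimal sketch of the idea behind indexing and searching: a toy inverted index that maps each term to the documents containing it. It uses only the Java standard library and is not Lucene's API; the class name TinyIndex and its add and search methods are hypothetical names chosen for illustration, and the tokenization (lowercasing and splitting on non-word characters) is a simplifying assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeSet;

// Toy inverted index built on the Java standard library only.
// This is NOT Lucene's API; all names here are hypothetical.
public class TinyIndex {
    // Maps each lowercased term to the sorted set of document ids that contain it.
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    // Indexing step: split the text into terms and record the document id under each term.
    public void add(int docId, String text) {
        for (String term : text.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!term.isEmpty()) {
                postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
            }
        }
    }

    // Searching step: look up one term and return the matching ids in ascending order.
    public List<Integer> search(String term) {
        TreeSet<Integer> ids = postings.get(term.toLowerCase(Locale.ROOT));
        return ids == null ? new ArrayList<>() : new ArrayList<>(ids);
    }

    public static void main(String[] args) {
        TinyIndex index = new TinyIndex();
        index.add(1, "Indexing data with Apache Lucene");
        index.add(2, "Searching indexed data with Apache Lucene");
        System.out.println(index.search("lucene"));    // prints [1, 2]
        System.out.println(index.search("searching")); // prints [2]
    }
}
```

Searching only finds what the indexing step stored, which is why the two recipes belong together. A real library such as Lucene adds text analysis, scoring, and on-disk storage on top of this basic idea.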

Many different modern search platforms, such as Apache Solr ...
