Lucene 4 Cookbook

Book Description

Over 70 hands-on recipes to quickly and effectively integrate Lucene into your search application

In Detail

Lucene 4 Cookbook is a practical guide that shows you how to build a scalable search engine for your application, from an internal documentation search to a wide-scale web implementation with millions of records. It starts by helping you install Apache Lucene and guides you through creating your first search application. The book then walks you through analyzing your text and indexing your data to improve the performance of your search application. As you progress through the chapters, you will learn to search your indexes effectively and employ real-time searching.

The chapters start off with simple concepts and build up to complex solutions that should help you on your way to becoming a search engine expert.

What You Will Learn

  • Explore the best practices to make the most of your search application

  • Create and write documents in an index

  • Customize scoring and boosting in your application to influence search results

  • Extend Lucene's functionality with add-on modules for features such as spatial searching and faceting

  • Load and initialize the library and build a search index of data

  • Understand the trade-off between NRT latency and throughput

  • Optimize your search applications by employing features such as near real-time (NRT) search

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.

    Table of Contents

    1. Lucene 4 Cookbook
      1. Table of Contents
      2. Lucene 4 Cookbook
      3. Credits
      4. About the Authors
      5. About the Reviewers
      6. www.PacktPub.com
        1. Support files, eBooks, discount offers, and more
          1. Why Subscribe?
          2. Free Access for Packt account holders
      7. Preface
        1. What this book covers
        2. What you need for this book
        3. Who this book is for
        4. Sections
          1. Getting ready
          2. How to do it…
          3. How it works…
          4. There's more…
          5. See also
        5. Conventions
        6. Reader feedback
        7. Customer support
          1. Downloading the example code
          2. Errata
          3. Piracy
          4. Questions
      8. 1. Introducing Lucene
        1. Introduction
          1. How Lucene works
          2. Why is Lucene so popular?
          3. Some Lucene implementations
        2. Installing Lucene
          1. How to do it...
          2. How it works…
        3. Setting up a simple Java Lucene project
          1. Getting ready
          2. How to do it...
          3. How it works...
        4. Obtaining an IndexWriter
          1. How to do it…
          2. How it works…
        5. Creating an analyzer
          1. Getting ready
          2. How to do it...
          3. How it works…
        6. Creating fields
          1. How to do it...
          2. How it works…
        7. Creating and writing documents to an index
          1. How to do it...
          2. How it works…
        8. Deleting documents
          1. How to do it...
          2. How it works…
        9. Obtaining an IndexSearcher
          1. How to do it...
          2. How it works…
        10. Creating queries with the Lucene QueryParser
          1. How to do it...
          2. How it works…
        11. Performing a search
          1. How to do it...
          2. How it works…
        12. Enumerating results
          1. How to do it...
          2. How it works…
      9. 2. Analyzing Your Text
        1. Introduction
        2. Obtaining a common analyzer
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more…
        3. Obtaining a TokenStream
          1. Getting ready
          2. How to do it...
          3. How it works…
        4. Obtaining TokenAttribute values
          1. Getting ready
          2. How to do it…
          3. How it works…
        5. Using PositionIncrementAttribute
          1. Getting ready
          2. How to do it...
          3. How it works…
        6. Using PerFieldAnalyzerWrapper
          1. Getting ready
          2. How to do it…
          3. How it works…
        7. Defining custom TokenFilters
          1. How to do it…
          2. How it works…
        8. Defining custom analyzers
          1. How to do it…
          2. How it works…
        9. Defining custom tokenizers
          1. How to do it…
          2. How it works…
        10. Defining custom attributes
          1. How to do it…
          2. How it works…
      10. 3. Indexing Your Data
        1. Introduction
        2. Obtaining an IndexWriter
          1. How to do it...
          2. How it works...
        3. Creating a StringField
          1. How to do it...
          2. How it works...
        4. Creating a TextField
          1. How to do it...
          2. How it works...
        5. Creating a numeric field
          1. How to do it...
          2. How it works...
        6. Creating a DocValue Field
          1. How to do it...
          2. How it works...
        7. Transactional commits and index versioning
          1. How to do it...
          2. How it works...
        8. Reusing field and document objects per thread
          1. How to do it...
          2. How it works...
        9. Delving into field norms
          1. How to do it...
          2. How it works...
        10. Changing similarity implementation used during indexing
          1. Getting ready
          2. How to do it…
      11. 4. Searching Your Indexes
        1. Introduction
        2. Obtaining IndexReaders
          1. How to do it...
          2. How it works...
        3. Un-inverting single-valued fields in memory with FieldCache
          1. How to do it...
          2. How it works...
        4. TermVectors
          1. How to do it...
          2. How it works...
        5. IndexSearcher
          1. How to do it...
          2. How it works...
        6. Constructing queries
          1. How to do it...
          2. How it works...
        7. Specifying sort logic
          1. How to do it...
          2. How it works...
        8. Forming a search result
          1. How to do it...
          2. How it works...
        9. Pagination
          1. How to do it...
          2. How it works...
        10. Using Collectors
          1. How to do it...
          2. How it works...
        11. Sorting with custom FieldComparator
          1. How to do it...
          2. How it works...
      12. 5. Near Real-time Searching
        1. Introduction
        2. Using the DirectoryReader to open index in Near Real-Time
          1. How to do it...
          2. How it works...
        3. Using the SearcherManager to refresh IndexSearcher
          1. How to do it...
          2. How it works...
        4. Generational indexing with TrackingIndexWriter
          1. How to do it…
          2. How it works...
        5. Maintaining search sessions with SearcherLifetimeManager
          1. How to do it…
          2. How it works…
        6. Performance tuning: latency and throughput
          1. How to do it…
          2. How it works…
      13. 6. Querying and Filtering Data
        1. Introduction
        2. Performing advanced filtering
          1. How to do it...
          2. How it works…
        3. Creating a custom filter
          1. How to do it...
        4. Searching with QueryParser
          1. How to do it...
            1. Wildcard search
            2. Term range search
            3. Autogenerated phrase query
            4. Date resolution
            5. Default operator
            6. Enable position increments
            7. Fuzzy query
            8. Lowercase expanded term
            9. Phrase slop
        5. TermQuery and TermRangeQuery
          1. How to do it...
        6. BooleanQuery
          1. How to do it...
          2. How it works…
        7. PrefixQuery and WildcardQuery
          1. How to do it...
          2. How it works…
        8. PhraseQuery and MultiPhraseQuery
          1. How to do it...
          2. How it works…
        9. FuzzyQuery
          1. How to do it...
          2. How it works…
        10. NumericRangeQuery
          1. How to do it…
          2. How it works…
        11. DisjunctionMaxQuery
          1. How to do it…
          2. How it works…
        12. RegexpQuery
          1. How to do it…
        13. SpanQuery
          1. How to do it...
          2. How it works…
        14. CustomScoreQuery
          1. How to do it…
          2. How it works…
      14. 7. Flexible Scoring
        1. Introduction
        2. Overriding similarity
          1. How to do it…
          2. How it works…
          3. There's more…
            1. The BM25 model
            2. The language model
            3. The divergence from randomness model
            4. The information-based model
        3. Implementing the BM25 model
          1. How to do it…
          2. How it works…
        4. Implementing the language model
          1. How to do it…
        5. Implementing the divergence from randomness model
          1. How to do it…
          2. How it works…
        6. Implementing the information-based model
          1. How to do it…
          2. How it works…
      15. 8. Introducing Elasticsearch
        1. Introduction
        2. Getting Elasticsearch
          1. Getting ready
          2. How to do it...
          3. How it works...
          4. There's more…
        3. Creating a new index
          1. How to do it…
          2. How it works…
        4. Predefine field mappings
          1. How to do it...
          2. How it works...
        5. Adding a document
          1. How to do it...
          2. How it works...
        6. Deleting a document
          1. How to do it…
          2. How it works…
        7. Updating a document
          1. How to do it…
          2. How it works…
        8. Performing bulk indexing
          1. How to do it…
          2. How it works…
        9. Searching the index
          1. How to do it...
          2. How it works...
          3. There's more…
        10. Scaling Elasticsearch
          1. How to do it…
          2. How it works…
          3. There's more…
      16. 9. Extending Lucene with Modules
        1. Introduction
        2. Exploring spatial search
          1. Getting ready…
          2. How to do it…
          3. How it works…
          4. There's more…
        3. Implementing joins
          1. Getting ready…
          2. How to do it…
        4. Performing faceting
          1. Getting ready…
          2. How to do it…
          3. How it works…
        5. Implementing grouping
          1. Getting ready…
          2. How to do it…
        6. Employing autosuggest
          1. Getting ready…
          2. How to do it…
        7. Implementing highlighting
          1. Getting ready…
          2. How to do it…
          3. How it works…
      17. Index