Apache Solr Essentials

Book Description

Leverage the power of Apache Solr to create efficient search applications

In Detail

Search is everywhere. Users expect mobile and web applications to offer a search facility that lets them find things quickly and easily.

Apache Solr Essentials is a fast-paced guide that helps you quickly learn how to create a scalable, efficient, and powerful search application. The book starts by explaining the fundamentals of Solr and then covers data indexing, ways of extending Solr, the client APIs and their indexing and searching capabilities, the administration, monitoring, and tuning of a Solr instance, and the concepts of sharding and replication. Next, you'll learn about various Solr extensions and how to contribute to the Solr community. By the end of this book, you will be able to create excellent search applications with the help of Solr.

What You Will Learn

  • Index your data using formats such as XML, JSON, and CSV

  • Manage, monitor, and tune a Solr instance

  • Deploy Apache Solr in different environments, depending upon your project requirements

  • Refine your search with various Solr client APIs

  • Create custom components by leveraging the Apache Solr extension points

  • Understand and utilize replication and sharding methods in a distributed search system

  • Create and customize your own Solr instance for your project

Downloading the example code for this book: You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.

    Table of Contents

    1. Apache Solr Essentials
      1. Table of Contents
      2. Apache Solr Essentials
      3. Credits
      4. About the Author
      5. Acknowledgments
      6. About the Reviewers
      7. www.PacktPub.com
        1. Support files, eBooks, discount offers, and more
          1. Why subscribe?
          2. Free access for Packt account holders
      8. Preface
        1. What this book covers
        2. What you need for this book
        3. Who this book is for
        4. Conventions
        5. Reader feedback
        6. Customer support
          1. Downloading the example code
          2. Errata
          3. Piracy
          4. Questions
      9. 1. Get Me Up and Running
        1. Installing a standalone Solr instance
          1. Prerequisites
          2. Downloading the right version
          3. Setting up and running the server
        2. Setting up a Solr development environment
          1. Prerequisites
          2. Importing the sample project of this chapter
          3. Understanding the project structure
          4. Different ways to run Solr
            1. Background server
            2. Integration test server
        3. What do we have installed?
          1. Solr home
          2. solr.xml
          3. schema.xml
          4. solrconfig.xml
          5. Other resources
        4. Troubleshooting
          1. UnsupportedClassVersionError
          2. The "Failed to read artifact descriptor" message
        5. Summary
      10. 2. Indexing Your Data
        1. Understanding the Solr data model
          1. The document
          2. The inverted index
          3. The Solr core
          4. The Solr schema
            1. Field types
              1. The text analysis process
              2. Char filters
              3. Tokenizers
              4. Token filters
              5. Putting it all together
              6. Some example field types
                1. String
                2. Numbers
                3. Boolean
                4. Date
                5. Text
                6. Other types
            2. Fields
              1. Static fields
              2. Dynamic fields
              3. Copy fields
            3. Other schema sections
              1. Unique key
              2. Default similarity
        2. Solr indexing configuration
          1. General settings
          2. Index configuration
          3. Update handler and autocommit feature
          4. RequestHandler
          5. UpdateRequestProcessor
        3. Index operations
          1. Add
            1. Sending add commands
          2. Delete
          3. Commit, optimize, and rollback
        4. Extending and customizing the index process
          1. Changing the stored value of fields
          2. Indexing custom data
        5. Troubleshooting
          1. Multivalued fields and the copyField directive
          2. The copyField input value
          3. Required fields and the copyField directive
          4. Stored text is immutable!
          5. Data not indexed
        6. Summary
      11. 3. Searching Your Data
        1. The sample project
        2. Querying
          1. Search-related configuration
          2. Query analyzers
          3. Common query parameters
            1. Field lists
            2. Filter queries
        3. Query parsers
          1. The Solr query parser
            1. Terms, fields, and operators
            2. Boosts
            3. Wildcards
            4. Fuzzy
            5. Proximity
            6. Ranges
          2. The Disjunction Maximum query parser
            1. Query Fields
            2. Alternative query
            3. Minimum should match
            4. Phrase fields
            5. Query phrase slop
            6. Phrase slop
            7. Boost queries
            8. Additive boost functions
            9. Tie breaker
          3. The Extended Disjunction Maximum query parser
            1. Fielded search
            2. Phrase bigram and trigram fields
            3. Phrase bigram and trigram slop
            4. Multiplicative boost function
            5. User fields
            6. Lowercase operators
          4. Other available parsers
        4. Search components
          1. Query
          2. Facet
            1. Facet queries
            2. Facet fields
            3. Facet ranges
            4. Pivot facets
            5. Interval facets
          3. Highlighting
            1. Standard highlighter
            2. Fast vector highlighter
            3. Postings highlighter
          4. More like this
          5. Other components
        5. Search handler
          1. Standard request handler
            1. Search components
            2. Query parameters
          2. RealTimeGetHandler
        6. Response output writers
        7. Extending Solr
          1. Mixing real-time and indexed data
          2. Using a custom response writer
        8. Troubleshooting
          1. Queries don't match expected documents
          2. Mismatch between index and query analyzer
          3. No score is returned in response
        9. Summary
      12. 4. Client API
        1. Solrj
          1. SolrServer – the Solr façade
          2. Input and output data transfer objects
          3. Adds and deletes
          4. Search
        2. Other bindings
        3. Summary
      13. 5. Administering and Tuning Solr
        1. Dashboard
          1. Physical and JVM memory
          2. Disk usage
          3. File descriptors
        2. Logging
        3. Core Admin
        4. Java properties and thread dump
        5. Core overview
          1. Caches
        6. Cache life cycles
          1. Cache sizing
          2. Cached object life cycle
          3. Cache stats
          4. Types of cache
            1. Filter cache
            2. Query Result cache
            3. Document cache
            4. Field value cache
            5. Custom cache
          5. Query handlers
          6. Update handlers
        7. JMX
        8. Summary
      14. 6. Deployment Scenarios
        1. Standalone instance
        2. Shards
        3. Master/slaves scenario
        4. Shards with replication
        5. SolrCloud
          1. Cluster management
          2. Replication factor, leaders, and replicas
          3. Durability and recovery
          4. The new terminology
          5. Administration console
          6. Collections API
          7. Distributed search
          8. Cluster-aware index
        6. Summary
      15. 7. Solr Extensions
        1. DataImportHandler
          1. Data sources
          2. Documents, entities, and fields
          3. Transformers
          4. Entity processors
          5. Event listeners
        2. Content Extraction Library
        3. Language Identifier
        4. Rapid prototyping with Solritas
        5. Other extensions
          1. Clustering
          2. UIMA Metadata Extraction Library
          3. MapReduce
        6. Summary
      16. 8. Contributing to Solr
        1. Identifying your needs
          1. An example – SOLR-3191
        2. Subscribing to mailing lists
        3. Signing up on JIRA
        4. Setting up the development environment
          1. Version control
          2. Code style
          3. Checking out the code
          4. Creating the project in your IDE
        5. Making your changes
        6. Creating and submitting a patch
        7. Other ways to contribute
          1. Documentation
          2. Mailing list moderator
        8. Summary
      17. Index