ElasticSearch Server

Book Description

Create a fast, scalable, and flexible search solution with the emerging open source search server, ElasticSearch

  • Learn the basics of ElasticSearch, such as data indexing, analysis, and dynamic mapping

  • Query and filter ElasticSearch for more accurate and precise search results

  • Learn how to monitor and manage ElasticSearch clusters and troubleshoot any problems that arise

In Detail

ElasticSearch is an open source search server built on Apache Lucene. It was built to provide a scalable search solution with built-in support for near real-time search and multi-tenancy.

This book takes you into the world of ElasticSearch by setting up your own custom cluster and shows you how to create a fast, scalable, and flexible search solution. By teaching the ins and outs of data indexing and analysis, "ElasticSearch Server" will start you on your journey to mastering the powerful capabilities of ElasticSearch. With practical chapters covering how to search data, extend your search, and go deep into cluster administration and search analysis, this book suits both newcomers and readers experienced with search servers.

In "ElasticSearch Server" you will learn how to revolutionize your website or application with faster, more accurate, and more flexible search functionality. Starting with chapters on setting up your own ElasticSearch cluster, searching, and extending your search parameters, you will quickly be able to create a fast, scalable, and completely custom search solution.

Building further on that knowledge, you will learn about ElasticSearch’s query API and become confident using its powerful filtering and faceting capabilities. You will develop practical knowledge of how to make use of ElasticSearch’s near real-time capabilities and support for multi-tenancy.

Your journey then concludes with chapters that help you monitor and tune your ElasticSearch cluster and that cover advanced topics such as shard allocation, gateway configuration, and the discovery module.

Table of Contents

  1. ElasticSearch Server
    1. Table of Contents
    2. ElasticSearch Server
    3. Credits
    4. About the Authors
    5. Acknowledgement
    6. Acknowledgement
    7. About the Reviewers
    8. www.PacktPub.com
      1. Support files, eBooks, discount offers and more
        1. Why Subscribe?
        2. Free Access for Packt account holders
    9. Preface
      1. What this book covers
      2. What you need for this book
      3. Who this book is for
      4. Conventions
      5. Reader feedback
      6. Customer support
        1. Downloading the example code
        2. Errata
        3. Piracy
        4. Questions
    10. 1. Getting Started with ElasticSearch Cluster
      1. What is ElasticSearch?
        1. Index
        2. Document
        3. Document type
        4. Node and cluster
        5. Shard
        6. Replica
      2. Installing and configuring your cluster
      3. Directory structure
      4. Configuring ElasticSearch
      5. Running ElasticSearch
      6. Shutting down ElasticSearch
      7. Running ElasticSearch as a system service
      8. Data manipulation with REST API
        1. What is REST?
        2. Storing data in ElasticSearch
        3. Creating a new document
        4. Retrieving documents
        5. Updating documents
        6. Deleting documents
      9. Manual index creation and mappings configuration
        1. Index
        2. Types
        3. Index manipulation
        4. Schema mapping
          1. Type definition
          2. Fields
          3. Core types
            1. Common attributes
            2. String
            3. Number
            4. Date
            5. Boolean
            6. Binary
          4. Multi fields
          5. Using analyzers
            1. Out-of-the-box analyzers
            2. Defining your own analyzers
            3. Analyzer fields
            4. Default analyzers
          6. Storing a document source
          7. All field
      10. Dynamic mappings and templates
        1. Type determining mechanism
        2. Dynamic mappings
        3. Templates
          1. Storing templates in files
      11. When routing does matter
        1. How does indexing work?
        2. How does searching work?
        3. Routing
        4. Routing parameters
        5. Routing fields
      12. Index aliasing and simplifying your everyday work using it
        1. An alias
        2. Creating an alias
        3. Modifying aliases
        4. Combining commands
        5. Retrieving all aliases
        6. Filtering aliases
        7. Aliases and routing
      13. Summary
    11. 2. Searching Your Data
      1. Understanding the querying and indexing process
      2. Mappings
        1. Data
      3. Querying ElasticSearch
        1. Simple query
        2. Paging and results size
        3. Returning the version
        4. Limiting the score
        5. Choosing the fields we want to return
          1. Partial fields
        6. Using script fields
          1. Passing parameters to script fields
        7. Choosing the right search type (advanced)
        8. Search execution preference (advanced)
      4. Basic queries
        1. The term query
        2. The terms query
        3. The match query
          1. The Boolean match query
          2. The phrase match query
          3. The match phrase prefix query
        4. The multi match query
        5. The query string query
          1. Lucene query syntax
          2. Explaining the query string
          3. Running query string query against multiple fields
        6. The field query
        7. The identifiers query
        8. The prefix query
        9. The fuzzy like this query
        10. The fuzzy like this field query
        11. The fuzzy query
        12. The match all query
        13. The wildcard query
        14. The more like this query
        15. The more like this field query
        16. The range query
        17. Query rewrite
      5. Filtering your results
        1. Using filters
        2. Range filters
        3. Exists
        4. Missing
        5. Script
        6. Type
        7. Limit
        8. IDs
        9. If this is not enough
        10. bool, and, or, not filters
        11. Named filters
        12. Caching filters
      6. Compound queries
        1. The bool query
        2. The boosting query
        3. The constant score query
        4. The indices query
        5. The custom filters score query
        6. The custom boost factor query
        7. The custom score query
      7. Sorting data
        1. Default sorting
        2. Selecting fields used for sorting
        3. Specifying behavior for missing fields
        4. Dynamic criteria
        5. Collation and national characters
      8. Using scripts
        1. Available objects
        2. MVEL
        3. Other languages
        4. Script library
        5. Native code
      9. Summary
    12. 3. Extending Your Structure and Search
      1. Indexing data that is not flat
        1. Data
        2. Objects
        3. Arrays
        4. Mappings
          1. Final mappings
        5. To be or not to be dynamic
        6. Sending the mappings to ElasticSearch
      2. Extending your index structure with additional internal information
        1. The identifier field
        2. The _type field
        3. The _all field
        4. The _source field
        5. The _boost field
        6. The _index field
        7. The _size field
        8. The _timestamp field
        9. The _ttl field
      3. Highlighting
        1. Getting started with highlighting
        2. Field configuration
        3. Under the hood
        4. Configuring HTML tags
        5. Controlling highlighted fragments
        6. Global and local settings
        7. Require matching
      4. Autocomplete
        1. The prefix query
        2. Edge ngrams
        3. Faceting
      5. Handling files
        1. Additional information about a file
      6. Geo
        1. Mapping preparation for spatial search
        2. Example data
        3. Sample queries
        4. Bounding box filtering
        5. Limiting the distance
      7. Summary
    13. 4. Make Your Search Better
      1. Why this document was found
        1. Understanding how a field is analyzed
        2. Explaining the query
      2. Influencing scores with query boosts
        1. What is boost?
        2. Adding boost to queries
        3. Modifying the score
          1. Constant score query
          2. Custom boost factor query
          3. Boosting query
          4. Custom score query
          5. Custom filters score query
      3. When does index-time boosting make sense
        1. Defining field boosting in input data
        2. Defining document boosting in input data
        3. Defining boosting in mapping
      4. The words having the same meaning
        1. Synonym filter
          1. Synonyms in mappings
          2. Synonyms in files
        2. Defining synonym rules
          1. Using Apache Solr synonyms
            1. Explicit synonyms
            2. Equivalent synonyms
            3. Expanding synonyms
          2. Using WordNet synonyms
        3. Query- or index-time synonym expansion
      5. Searching content in different languages
        1. Why we need to handle languages differently
        2. How to handle multiple languages
        3. Detecting a document's language
        4. Sample document
        5. Mappings
        6. Querying
          1. Queries with a known language
          2. Queries with an unknown language
          3. Combining queries
      6. Using span queries
        1. What is a span?
        2. Span term query
        3. Span first query
        4. Span near query
        5. Span or query
        6. Span not query
        7. Performance considerations
      7. Summary
    14. 5. Combining Indexing, Analysis, and Search
      1. Indexing tree-like structures
      2. Modifying your index structure with the update API
        1. The mapping
        2. Adding a new field
        3. Modifying fields
      3. Using nested objects
      4. Using parent-child relationships
        1. Mappings and indexing
          1. Creating parent mappings
          2. Creating child mappings
          3. Parent document
          4. Child documents
        2. Querying
          1. Querying for data in the child documents
          2. The top children query
          3. Querying for data in the parent documents
        3. Parent-child relationship and filtering
        4. Performance considerations
      5. Fetching data from other systems: river
        1. What we need and what a river is
        2. Installing and configuring a river
      6. Batch indexing to speed up your indexing process
        1. How to prepare data
        2. Indexing the data
        3. Is it possible to do it quicker?
      7. Summary
    15. 6. Beyond Searching
      1. Faceting
        1. Document structure
        2. Returned results
        3. Query
        4. Filter
        5. Terms
        6. Range
          1. Choosing different fields for aggregated data calculation
        7. Numerical and date histogram
          1. Date histogram
        8. Statistical
        9. Terms statistics
        10. Spatial
        11. Filtering faceting results
        12. Scope of your faceting calculation
          1. Facet calculation on all nested documents
          2. Facet calculation on nested documents that match a query
        13. Faceting memory considerations
      2. More like this
        1. Example data
        2. Finding similar documents
      3. Percolator
        1. Preparing the percolator
        2. Getting deeper
      4. Summary
    16. 7. Administrating Your Cluster
      1. Monitoring your cluster state and health
        1. The cluster health API
        2. The indices stats API
          1. Docs
          2. Store
          3. Indexing, get, and search
        3. The status API
        4. The nodes info API
        5. The nodes stats API
        6. The cluster state API
        7. The indices segments API
      2. Controlling shard and replica allocation
        1. Explicitly controlling allocation
          1. Specifying nodes' parameters
          2. Configuration
          3. Index creation
          4. Excluding nodes from allocation
          5. Using IP addresses for shard allocation
        2. Cluster-wide allocation
        3. Number of shards and replicas per node
        4. Manually moving shards and replicas
          1. Moving shards
          2. Canceling allocation
          3. Allocating shards
          4. Multiple commands per HTTP request
      3. Tools for instance and cluster state diagnosis
        1. Bigdesk
        2. elasticsearch-head
        3. elasticsearch-paramedic
        4. SPM for ElasticSearch
      4. Your ElasticSearch time machine
        1. The gateway module
          1. Local gateway
          2. Shared filesystem gateway
          3. Hadoop distributed filesystem gateway
            1. Plugin needed
          4. Amazon S3 gateway
            1. Plugin needed
        2. Recovery control
      5. Node discovery
        1. Discovery types
        2. Master node
          1. Configuring master and data nodes
          2. Master election configuration
        3. Setting the cluster name
        4. Configuring multicast
        5. Configuring unicast
        6. Nodes ping settings
      6. ElasticSearch plugins
        1. Installing plugins
        2. Removing plugins
        3. Plugin types
      7. Summary
    17. 8. Dealing with Problems
      1. Why is the result on later pages slow
        1. What is the problem?
        2. Scrolling to the rescue
      2. Controlling cluster rebalancing
        1. What is rebalancing?
        2. When is the cluster ready?
        3. The cluster rebalancing settings
          1. Controlling when rebalancing will start
          2. Controlling the number of shards being moved between nodes concurrently
          3. Controlling the number of shards initialized concurrently on a single node
          4. Controlling the number of primary shards initialized concurrently on a single node
          5. Disabling the allocation of shards and replicas
          6. Disabling the allocation of replicas
      3. Validating your queries
        1. How to use the Validate API
      4. Warming up
        1. Defining a new warming query
        2. Retrieving defined warming queries
        3. Deleting a warming query
        4. Disabling the warming up functionality
        5. Which queries to choose
      5. Summary
    18. Index