Choosing the right number of shards and replicas

If your dataset is limited in size and grows only slowly, a single primary shard with one replica is enough. If your dataset is not limited and grows quickly, the optimal number of shards depends on the target number of nodes.
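As a rough illustration of the small-dataset case, the request body for an index with one primary shard and one replica could be built as follows. This is only a sketch: the helper name build_index_settings is hypothetical, while number_of_shards and number_of_replicas are the standard Elasticsearch index settings.

```python
import json


def build_index_settings(primary_shards=1, replicas=1):
    """Return a create-index request body (hypothetical helper, not a library call)."""
    if primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    if replicas < 0:
        raise ValueError("the number of replicas cannot be negative")
    return {
        "settings": {
            "index": {
                # Number of primary shards for the new index.
                "number_of_shards": primary_shards,
                # Number of replica copies kept for each primary shard.
                "number_of_replicas": replicas,
            }
        }
    }


if __name__ == "__main__":
    # Print the body for the single-shard, single-replica case.
    print(json.dumps(build_index_settings(), indent=2))
```

A body like this would be sent when the index is created. The number of primary shards cannot be changed in place on an existing index, whereas the number of replicas can be updated later, so the shard count is the value that deserves the planning.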

In fact, a single node can be sufficient for many simple use cases. However, to increase fault tolerance and prevent data loss, which is what a distributed architecture is for, you can use more than one node. So the first question to answer is: how many nodes will we run?
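To see why a single node limits fault tolerance, recall that Elasticsearch does not place a replica on the same node as its primary shard. The sketch below, with the hypothetical helper assignable_replicas, estimates how many of the configured replicas of one shard can actually be assigned for a given number of data nodes. It assumes default allocation behavior and ignores disk watermarks, allocation filtering, and similar constraints.

```python
def assignable_replicas(data_nodes, configured_replicas):
    """Estimate how many replicas of one shard can be assigned (hypothetical helper).

    Each copy of a shard (the primary plus its replicas) must live on a
    different node, so at most data_nodes - 1 replicas can be placed.
    """
    if data_nodes < 1:
        raise ValueError("at least one data node is required")
    if configured_replicas < 0:
        raise ValueError("the number of replicas cannot be negative")
    return min(configured_replicas, data_nodes - 1)


if __name__ == "__main__":
    for nodes in (1, 2, 3):
        placed = assignable_replicas(nodes, configured_replicas=1)
        print(f"{nodes} data node(s): {placed} of 1 replica(s) assignable")
```

With one data node the replica stays unassigned, so the index has no second copy to fall back on if that node is lost. With two or more data nodes the replica can be placed on a different node than the primary.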

Even to answer this question, we first need answers to a few others. For example: do we need non-data nodes? If we don't need ...

