Sharding and fault tolerance

We covered sharding, collections, and replicas in Chapter 6, Distributed Search Using Apache Solr. In this section, we will look at some important aspects of sharding and the role it plays in scalability and high availability. The strategy for creating new shards depends heavily on the hardware and on shard size. Say you have two machines, A and B, with the same configuration, each hosting one shard. Shard A holds 1 million indexed documents, and shard B holds 100 documents. When a query is fired, it is sent to both shards, and its overall response time is determined by the slowest node (in this case, the one hosting shard A). Hence, shards of nearly equal size can perform better in ...
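
The effect of the slowest shard can be illustrated with a small model. The following is a minimal sketch, not Solr code: it assumes that a distributed query waits for every shard and that per-shard latency grows linearly with the number of documents. The function names and the ms_per_million_docs cost figure are hypothetical, chosen only for illustration; real latency depends on the query, caching, and hardware.

```python
def distributed_query_latency(shard_latencies_ms):
    """Latency of a fan-out query: the coordinator waits for every shard,
    so the slowest shard sets the overall response time."""
    return max(shard_latencies_ms)


def estimated_shard_latency_ms(doc_count, ms_per_million_docs=50.0):
    """Hypothetical linear cost model (an assumption for illustration):
    latency is proportional to the number of documents in the shard."""
    return doc_count / 1_000_000 * ms_per_million_docs


def query_latency_for_layout(doc_counts):
    """Estimated query latency for a list of per-shard document counts."""
    return distributed_query_latency(
        [estimated_shard_latency_ms(n) for n in doc_counts]
    )


# The skewed layout from the text: shard A has 1 million documents, shard B has 100.
skewed = [1_000_000, 100]
# The same total number of documents split evenly across the two shards.
balanced = [500_050, 500_050]

print("skewed layout (ms):  ", query_latency_for_layout(skewed))
print("balanced layout (ms):", query_latency_for_layout(balanced))
```

Under these assumptions, the skewed layout is as slow as a single shard holding almost all the data, while the balanced layout roughly halves the estimated latency with the same two machines.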
