Chapter 11. Inside a Shard

In Chapter 2, we introduced the shard and described it as a low-level worker unit. But what exactly is a shard, and how does it work? This chapter answers the following questions:

Why is search near real-time?
Why are document CRUD (create-read-update-delete) operations real-time?
How does Elasticsearch ensure that the changes you make are durable, that they won’t be lost if there is a power failure?
Why does deleting documents not free up space immediately?
What do the refresh, flush, and optimize APIs do, and when should you use them?

The easiest way to understand how a shard functions today is to start with a history lesson. We will look at the problems that had to be solved in order to provide a distributed, durable data store with near real-time search and analytics.

Making Text Searchable

The first challenge that had to be solved was how to make text searchable. Traditional databases store a single value per field, but this is insufficient for full-text search. Every word in a text field needs to be searchable, which means that the database needs to be able to index multiple values—words, ...
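To make the idea of indexing multiple values per field concrete, here is a minimal sketch in Python of a structure commonly called an inverted index: a mapping from each word to the documents that contain it. The function name build_word_index, the sample documents, and the whitespace-and-lowercase tokenization are all assumptions made for illustration; this is not how Elasticsearch or Lucene actually implement indexing.

```python
from collections import defaultdict


def build_word_index(docs):
    """Map each lowercased word to the sorted IDs of the documents containing it.

    `docs` is a hypothetical dict of {doc_id: text}. Tokenization here is a
    naive whitespace split, chosen only to keep the sketch short.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            # Every word in the field becomes its own searchable value.
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


# Hypothetical sample documents, each with a single text field.
docs = {
    1: "The quick brown fox",
    2: "Quick brown foxes leap",
}

word_index = build_word_index(docs)

# Looking up a word returns every document whose text contains it.
print(word_index["quick"])
print(word_index.get("fox", []))
```

Note that "fox" and "foxes" end up as separate entries in this sketch, because nothing normalizes word forms; a single-value-per-field store would not be able to answer even these per-word lookups.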
