Anti-entropy repair

Anti-entropy repair is triggered manually. Running it periodically is often recommended, because hints and read repair alone are often not sufficient to keep data in sync.

Cassandra accomplishes anti-entropy repair using Merkle trees, similar to Dynamo and Riak. Anti-entropy is the process of comparing the data held by all replicas and updating each replica to the newest version. The process has three phases:

  • Building a Merkle tree for each replica
  • Comparing the Merkle trees to discover differences
  • Streaming the relevant data
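
The three phases above can be illustrated with a minimal Python sketch. It is a simplified model, not Cassandra's implementation: the helper names (`leaf_hash`, `build_tree`, `diff`, `repair`), the fixed list of buckets standing in for token ranges, and the in-memory "streaming" step are all assumptions made for illustration. Each replica is modeled as a list of buckets mapping a key to a `(value, timestamp)` pair, and the newest timestamp is assumed to win.

```python
import hashlib


def leaf_hash(rows):
    """Hash one bucket of rows (key -> (value, timestamp)) in sorted key order."""
    h = hashlib.sha256()
    for key in sorted(rows):
        value, ts = rows[key]
        h.update(f"{key}={value}@{ts};".encode())
    return h.digest()


def build_tree(buckets):
    """Phase 1: build a Merkle tree as a list of levels, leaves first, root last.

    Assumes len(buckets) is a power of two (hypothetical simplification).
    """
    level = [leaf_hash(b) for b in buckets]
    levels = [level]
    while len(level) > 1:
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
        levels.append(level)
    return levels


def diff(tree_a, tree_b):
    """Phase 2: walk down from the root and return indices of differing buckets."""

    def walk(depth, idx):
        if tree_a[depth][idx] == tree_b[depth][idx]:
            return []  # identical subtree, nothing to repair below it
        if depth == 0:
            return [idx]
        return walk(depth - 1, 2 * idx) + walk(depth - 1, 2 * idx + 1)

    return walk(len(tree_a) - 1, 0)


def repair(replica_a, replica_b):
    """Phase 3: exchange only mismatched buckets, keeping the newest version of each key."""
    mismatched = diff(build_tree(replica_a), build_tree(replica_b))
    for i in mismatched:
        merged = dict(replica_a[i])
        for key, (value, ts) in replica_b[i].items():
            if key not in merged or ts > merged[key][1]:
                merged[key] = (value, ts)
        replica_a[i] = dict(merged)
        replica_b[i] = dict(merged)
    return mismatched


# Two replicas that disagree in buckets 1 and 3 only.
a = [{"k1": ("v1", 1)}, {"k2": ("v2", 1)}, {}, {"k4": ("v4", 1)}]
b = [{"k1": ("v1", 1)}, {"k2": ("v2-new", 2)}, {}, {}]
print(repair(a, b))  # [1, 3]
print(a == b)        # True
```

Because matching subtrees are skipped during the comparison, only the buckets whose hashes differ are exchanged, which is the reason a Merkle tree is used rather than comparing every row directly.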

Why is running anti-entropy repair frequently so important? Consider a cluster with a replication factor of 3. Suppose a partition ...
