Glossary

I can’t believe what a bunch of nerds we are. We’re looking up “money laundering” in a dictionary.

Peter, Office Space

This glossary defines some of the terms that are important to understand when working with Apache Cassandra. There’s some really good material at http://wiki.apache.org/cassandra, but reading it for the first time can be tricky, as each new term seems to be explained only with other new terms. Many of these concepts are daunting to beginning or even intermediate web developers or database administrators, so they’re collected here for easy reference. Much of the information in this glossary is repeated and expanded upon in relevant sections throughout this book.

Anti-Entropy

Anti-entropy, or replica synchronization, is the mechanism in Cassandra for ensuring that data on different nodes is updated to the newest version.

Here’s how it works. During a major compaction (see Compaction), the server initiates a TreeRequest/TreeResponse conversation to exchange Merkle trees with neighboring nodes. A Merkle tree is a tree of hashes that summarizes the data in that column family. If the trees from the different nodes don’t match, then they have to be reconciled (or “repaired”) in order to determine the latest data values they should all be set to. This tree comparison is the responsibility of the org.apache.cassandra.service.AntiEntropyService class. AntiEntropyService implements the Singleton pattern and defines the static Differencer class as well. ...
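To make the tree comparison concrete, here is a minimal Python sketch of the general Merkle tree technique. It is not Cassandra's implementation (AntiEntropyService is Java code and hashes ranges of the column family rather than individual rows); the function names `build_merkle_tree` and `differing_leaves`, the choice of SHA-256, and the sample rows are all assumptions made for illustration. The sketch assumes both replicas divide their data into the same number of leaves in the same order.

```python
import hashlib


def _h(data: bytes) -> bytes:
    """Hash helper; SHA-256 is an arbitrary choice for this sketch."""
    return hashlib.sha256(data).digest()


def build_merkle_tree(leaves):
    """Return the tree as a list of levels.

    levels[0] holds the leaf hashes and levels[-1] holds the single root hash.
    An unpaired node at the end of a level is hashed with a copy of itself.
    """
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [_h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), 2):
            left = level[i]
            right = level[i + 1] if i + 1 < len(level) else left
            parents.append(_h(left + right))
        level = parents
        levels.append(level)
    return levels


def differing_leaves(tree_a, tree_b):
    """Return the indexes of leaves whose hashes differ between two trees.

    The walk starts at the root and descends only into subtrees whose
    hashes disagree, so matching subtrees are never inspected.
    """
    if len(tree_a[0]) != len(tree_b[0]):
        raise ValueError("trees must cover the same number of leaves")
    top = len(tree_a) - 1
    if tree_a[top][0] == tree_b[top][0]:
        return []  # roots match: the replicas agree
    suspects = [0]
    for depth in range(top - 1, -1, -1):
        children = []
        for idx in suspects:
            for child in (2 * idx, 2 * idx + 1):
                if child < len(tree_a[depth]) and tree_a[depth][child] != tree_b[depth][child]:
                    children.append(child)
        suspects = children
    return suspects


# Hypothetical data: two replicas that disagree on the third row only.
replica_a = [b"k1=v1", b"k2=v2", b"k3=v3", b"k4=v4"]
replica_b = [b"k1=v1", b"k2=v2", b"k3=stale", b"k4=v4"]

tree_a = build_merkle_tree(replica_a)
tree_b = build_merkle_tree(replica_b)
print(differing_leaves(tree_a, tree_b))  # prints [2]
```

The point of the structure is that two nodes can confirm they agree by comparing a single root hash, and when they disagree they only need to exchange and repair the portions of data under the mismatched branches.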
