Virtual nodes facilitate redistribution

The main advantage of virtual nodes is how they behave when the cluster changes. Consider the simple example of adding a fourth node to our three-node cluster. Without virtual nodes, the token ranges of all three pre-existing physical nodes change to make space for the fourth node. To accommodate the fourth machine, Cassandra must recalculate the target node for each individual row based on the mapping from its token to the new token range assignments of the four nodes. This process is known as rebalancing, and virtual nodes render it unnecessary.
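The difference can be illustrated with a toy model of the token ring. The following is a minimal sketch, not Cassandra's actual implementation: it assumes a 64-bit ring, uses MD5 as a stand-in for the real partitioner, gives each node one evenly spaced token in the case without virtual nodes, and picks virtual node tokens at random. The node names, the helper functions, and the count of eight virtual nodes per machine are all hypothetical choices made for illustration.

```python
import bisect
import hashlib
import random

RING_SIZE = 2**64  # assumed ring size for this toy model


def token_for(key):
    """Hash a row key to a ring token (MD5 is a stand-in partitioner)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def even_ring(nodes):
    """Without virtual nodes: one evenly spaced token per physical node."""
    step = RING_SIZE // len(nodes)
    return sorted((i * step, node) for i, node in enumerate(nodes))


def vnode_tokens(node, count, rng):
    """With virtual nodes: several randomly placed tokens per physical node."""
    return [(rng.randrange(RING_SIZE), node) for _ in range(count)]


def owner(ring, token):
    """Return the node holding the first ring token >= the given token."""
    # (token,) sorts just before any (token, node) pair with the same token,
    # so bisect_left finds the first entry whose token is >= ours.
    idx = bisect.bisect_left(ring, (token,)) % len(ring)  # wrap past the end
    return ring[idx][1]


def moved_keys(before, after, keys):
    """List (old_owner, new_owner) for every key whose owner changed."""
    moves = []
    for key in keys:
        token = token_for(key)
        old, new = owner(before, token), owner(after, token)
        if old != new:
            moves.append((old, new))
    return moves


keys = [f"row-{i}" for i in range(10_000)]
old_nodes = ["node1", "node2", "node3"]

# Case 1: no virtual nodes. Adding node4 re-spaces every token on the ring.
even_before = even_ring(old_nodes)
even_after = even_ring(old_nodes + ["node4"])
even_moves = moved_keys(even_before, even_after, keys)

# Case 2: virtual nodes. Existing tokens stay put; node4 only adds its own.
rng = random.Random(42)
vnode_before = sorted(t for n in old_nodes for t in vnode_tokens(n, 8, rng))
vnode_after = sorted(vnode_before + vnode_tokens("node4", 8, rng))
vnode_moves = moved_keys(vnode_before, vnode_after, keys)

for label, moves in (("even tokens", even_moves), ("virtual nodes", vnode_moves)):
    between_old = sum(1 for _, new in moves if new != "node4")
    print(f"{label}: {len(moves)} rows moved, {between_old} between old nodes")
```

In the evenly spaced case, some rows move from one pre-existing node to another because every range boundary shifts. In the virtual node case, the existing tokens are untouched, so the only rows that change owner are those that land on the new machine.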

When a new node joins the cluster, it's simply assigned a handful of virtual nodes that previously belonged to other machines. Rather than directly ...
