When a node goes down

In a cluster of any significant size, nodes will occasionally become unresponsive for a variety of reasons. Fortunately, Cassandra has a sophisticated mechanism called the failure detector, which is designed to determine when this has occurred and then mark the node as down.

Most node failures result from temporary conditions, such as network issues. Cassandra therefore assumes that a failed node will eventually come back online, and that any permanent change to the cluster will be made explicitly using nodetool.

Marking a downed node

Each node keeps track of the state of other nodes in the cluster by means of an accrual failure detector (or phi failure detector). This detector evaluates the health of other nodes based on a sliding window of gossip ...
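
To make the idea of an accrual detector concrete, here is a minimal Python sketch of a phi calculation for a single peer. It is not Cassandra's implementation: the class name, method names, and window size are hypothetical, and it assumes heartbeat inter-arrival times follow an exponential distribution, a common simplification. The threshold of 8 is meant to echo the default of Cassandra's phi_convict_threshold setting and should likewise be treated as an assumption here.

```python
import math
from collections import deque


class PhiAccrualDetector:
    """Illustrative phi accrual failure detector for one peer (hypothetical API)."""

    def __init__(self, window_size=1000, threshold=8.0):
        # Sliding window of the most recent heartbeat inter-arrival times.
        self.intervals = deque(maxlen=window_size)
        self.last_heartbeat = None
        # Assumed conviction threshold; higher values tolerate longer silences.
        self.threshold = threshold

    def heartbeat(self, now):
        """Record a heartbeat (for example, a gossip message) seen at time `now`."""
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now):
        """Return the suspicion level for the peer at time `now`."""
        if not self.intervals:
            return 0.0  # not enough history to suspect anything yet
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        elapsed = now - self.last_heartbeat
        # Assuming exponential arrivals: P(silence > elapsed) = exp(-elapsed / mean),
        # and phi = -log10(P), which simplifies to elapsed / (mean * ln 10).
        return elapsed / (mean * math.log(10))

    def is_down(self, now):
        """Treat the peer as down once phi exceeds the threshold."""
        return self.phi(now) > self.threshold
```

With heartbeats arriving once per second, phi stays well below the threshold shortly after the last heartbeat and grows steadily the longer the peer stays silent. The detector therefore produces a continuous suspicion level instead of a fixed timeout, and the threshold decides when that suspicion is high enough to mark the node as down.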
