Study notes from DS201: Foundations of Apache Cassandra™ and DataStax Enterprise.
Gossip Protocol
Cassandra uses the Gossip protocol to share information between nodes in a cluster and maintain the cluster’s state. The Gossip protocol is a distributed communication mechanism where nodes periodically communicate with each other, exchanging their own state information and information obtained from other nodes to propagate information throughout the entire cluster.
How the Gossip Protocol Works
- Periodic information exchange: Each node periodically sends “gossip” messages to randomly selected other nodes in the cluster.
- Information propagation: A node compares its own information (e.g., its state, load, schema version, etc.) with information received from other nodes. If newer information is available, it updates its own information and propagates the new information to other nodes.
- Efficient information delivery: Rather than sending all information, only changed or newest information is efficiently transmitted. This minimizes network bandwidth consumption while ensuring information is quickly disseminated throughout the cluster.
- Failure detection: The Gossip protocol also exchanges node heartbeat information. This allows other nodes to determine that a node is “down” when it stops responding, and that information is propagated throughout the cluster.
The Gossip protocol is the foundation for maintaining cluster health and consistency in Cassandra’s distributed architecture, supporting node joins/departures, failure detection, schema synchronization, and load information sharing.
Exercises
nodetool gossipinfo Command
The nodetool gossipinfo command displays information about nodes in the cluster that the local node has collected through the Gossip protocol.
$ nodetool gossipinfo
localhost/127.0.0.1
generation:1686366680
heartbeat:8156
STATUS:19:NORMAL,-1868919513406135542
LOAD:8135:1.2650205E7
SCHEMA:223:8aec9840-06b7-356a-b5ed-07e43a42d65e
DC:9:datacenter1
RACK:11:rack1
RELEASE_VERSION:6:4.1.2
RPC_ADDRESS:5:127.0.0.1
NET_VERSION:2:12
HOST_ID:3:349d6a93-038a-45a9-bd86-cc22ed3d8e0d
RPC_READY:21:true
NATIVE_ADDRESS_AND_PORT:4:127.0.0.1:9042
STATUS_WITH_PORT:18:NORMAL,-1868919513406135542
SSTABLE_VERSIONS:7:big-nb
TOKENS:17:<hidden>
This output shows that each node’s state is represented as key-value pairs. For example, STATUS is the node’s operational state (such as NORMAL), LOAD is the node’s load, DC is the data center, and RACK is the rack information.
In a cluster with multiple nodes, running nodetool gossipinfo on one node shows information about other nodes that it has learned through the Gossip protocol. This allows you to understand the overall health of the cluster.