Gossip Protocol

We’ll explore the first approach to maintaining a peer-to-peer cluster: the Gossip Protocol.

Eager Reliable Broadcast Protocol

Eager Reliable Broadcast is a communication protocol in distributed systems that ensures messages are broadcast to all participants. In this protocol, each peer continuously exchanges information with every other peer.

[Diagram: a cluster of Server 1 through Server 4, each peer exchanging messages directly with every other peer]

This protocol follows an eager delivery model, where messages are immediately sent to all peers upon generation. While effective, this approach becomes inefficient as the number of peers grows: the redundant message exchanges consume substantial bandwidth and system resources.

To address these inefficiencies, the Gossip Protocol was introduced, reducing resource usage by limiting the number of exchanges.

Gossip Protocol

The Gossip Protocol is akin to how rumors spread in an office. A peer starts by sharing a message with a few randomly selected peers. These peers then forward the message to others, eventually reaching all nodes in the cluster.

For example, Server 1 first informs Server 2 and Server 3 of a piece of information. They then propagate it to Server 4. Eventually, every server in the cluster has received the information.

[Diagram: three snapshots of the cluster as the message spreads from Server 1 to Server 2 and Server 3, and then on to Server 4]
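
In Python, a gossip round is essentially a fan-out to a small random subset of peers. The sketch below is illustrative only; the send() transport and the fanout value are assumptions, not part of any specific implementation:

```python
import random

FANOUT = 2  # how many random peers each node contacts per round (illustrative value)

def send(peer, message):
    # Placeholder transport; a real system would use TCP/UDP here.
    print(f"send {message!r} to {peer}")

def gossip(message, peers):
    """Forward a message to a small random subset of peers."""
    targets = random.sample(peers, min(FANOUT, len(peers)))
    for peer in targets:
        send(peer, message)  # each recipient repeats the procedure with its own peer list

gossip("shard map updated", ["Server 2", "Server 3", "Server 4"])
```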

Maintaining Cluster Membership

Each server maintains information such as addresses, states, and shard allocations of other nodes.

For example:

[Diagram: nodes A (1.1.1.1) and B (2.2.2.2), each storing the other's entry with State UP and its address]
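
Each node's local view can be modeled as a small table keyed by node name. The sketch below mirrors the fields in the example (state and address, plus a placeholder for shard allocations); the exact structure is an assumption for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class Member:
    address: str
    state: str = "UP"                           # UP or DOWN
    shards: list = field(default_factory=list)  # shard ranges owned by this node

# A's view of the cluster and B's view of the cluster, matching the example above
membership_a = {"B": Member(address="2.2.2.2")}
membership_b = {"A": Member(address="1.1.1.1")}
```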

Adding a Node

Bootstrap nodes serve as entry points for new nodes and are typically reached via static IPs or DNS.

In this example, A acts as a bootstrap node. When node C joins, it sends its information to A and receives the current cluster metadata.

[Diagram: C (3.3.3.3) asks bootstrap node A (1.1.1.1) to join; A records C as UP (newly added) and replies with the current cluster metadata, so C learns about A and B]

A then informs B of C’s arrival, allowing B to update its view of the cluster.

[Diagram: A informs B of C's arrival; B adds C (State UP, 3.3.3.3) to its membership table]

Now, all members are aware of C, meaning it has successfully joined the cluster. Based on this information, the system can redistribute data if necessary.

[Diagram: A, B, and C each hold a full membership table listing the other two nodes as UP with their addresses]
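
A minimal sketch of the join handshake, assuming an in-process Node class standing in for real RPC calls: the newcomer sends its own entry to the bootstrap node and merges the metadata it receives back.

```python
class Node:
    def __init__(self, name, address):
        self.name = name
        self.address = address
        # Every node keeps its own membership table, including itself.
        self.membership = {name: {"state": "UP", "address": address}}

    def handle_join(self, new_entry):
        """Bootstrap side: record the newcomer and return the full cluster view."""
        self.membership.update(new_entry)
        return dict(self.membership)

    def join(self, bootstrap):
        """Newcomer side: announce ourselves and adopt the returned cluster metadata."""
        view = bootstrap.handle_join({self.name: {"state": "UP", "address": self.address}})
        self.membership.update(view)

a = Node("A", "1.1.1.1")
a.membership["B"] = {"state": "UP", "address": "2.2.2.2"}   # A already knows about B
c = Node("C", "3.3.3.3")
c.join(a)            # C asks bootstrap node A to join
print(c.membership)  # C now knows A and B; gossip rounds spread C's entry to B
```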

Forwarding

With consistent hashing and shared metadata, any node can act as an interface:

  • Serving read requests directly or forwarding them to respective replicas.
  • Routing write requests to the owner node.
[Diagram: a client writes a record to Server 1, which calculates the appropriate node and forwards the request to Server 2]
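
Assuming consistent hashing, routing is a hash-ring lookup followed by either a local write or a forward. The owner_of() helper and the MD5-based ring below are illustrative choices, not prescribed by the protocol:

```python
import bisect
import hashlib

nodes = ["Server 1", "Server 2", "Server 3"]

def h(key):
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

# Build the ring as a sorted list of (hash, node) entries.
ring = sorted((h(node), node) for node in nodes)

def owner_of(key):
    """Walk clockwise to the first node whose position is at or after the key's hash."""
    idx = bisect.bisect(ring, (h(key),)) % len(ring)
    return ring[idx][1]

def handle_write(receiving_node, key, value):
    owner = owner_of(key)
    if owner == receiving_node:
        return f"{receiving_node} stores {key}={value}"
    return f"{receiving_node} forwards {key}={value} to {owner}"

print(handle_write("Server 1", "record-123", "John"))
```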

Gossip Rounds

Gossip Rounds form the backbone of the protocol. Periodically, each node selects random peers and shares its current state (e.g., heartbeat), progressively spreading updates throughout the cluster.

For example, at time 3, A gossips its state. B receives it and relays it to C.

[Diagram: before 00:03 every node holds heartbeat 1 for its peers; after A gossips at 00:03, the receiving nodes update A's entry to heartbeat 3 (new)]

Later, B also gossips its state to random nodes.

[Diagram: B gossips at 00:03; A and C update B's entry to heartbeat 3 (new)]
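
During a round, merging two views reduces to keeping, for each node, whichever entry carries the higher heartbeat. A sketch under that assumption:

```python
def merge_views(local, received):
    """Adopt any entry whose heartbeat is newer than the one we already hold."""
    for node, entry in received.items():
        if node not in local or entry["heartbeat"] > local[node]["heartbeat"]:
            local[node] = entry

# B's view before the round, and the digest A gossips at 00:03
b_view = {"A": {"state": "UP", "heartbeat": 1}, "C": {"state": "UP", "heartbeat": 1}}
a_digest = {"A": {"state": "UP", "heartbeat": 3}}

merge_views(b_view, a_digest)
print(b_view["A"])  # {'state': 'UP', 'heartbeat': 3}; B relays this newer entry onward
```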

Failure Detection

One key advantage of Gossip Rounds is fault detection. If a node doesn’t receive a heartbeat from another node within a defined time window, it marks that peer as DOWN.

If the heartbeat timeout is 3 seconds, by time 5, both A and B consider C down due to the absence of updates.

[Diagram: with HeartbeatLifetime = 3 and the clock at 5, A and B mark C as DOWN because its last heartbeat (1) has expired]
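
Failure detection is then a periodic sweep over the membership table that compares each peer's last heartbeat against the timeout. A sketch using the 3-second lifetime from the example (field names are illustrative):

```python
HEARTBEAT_LIFETIME = 3  # seconds, as in the example above

def detect_failures(view, now):
    """Mark any peer whose last heartbeat is older than the allowed lifetime as DOWN."""
    for node, entry in view.items():
        if now - entry["heartbeat"] > HEARTBEAT_LIFETIME:
            entry["state"] = "DOWN"

a_view = {"B": {"state": "UP", "heartbeat": 3}, "C": {"state": "UP", "heartbeat": 1}}
detect_failures(a_view, now=5)
print(a_view["C"]["state"])  # DOWN: no update from C within the window
```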

Temporary Promotion

When a node is deemed unreachable, the other nodes stop forwarding data to it and switch to a replica. Once the original node recovers, it synchronizes with the replica to restore the data it missed.

[Diagram: on failure, Server A stops forwarding to Server C (primary) and fails over to Server B (replica); on recovery, Server C pulls data from Server B and traffic returns to the primary]
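
A rough sketch of this failover routing, with a hypothetical replica map; real systems would derive the replica from the shared shard metadata:

```python
def route(primary, view, replicas):
    """Send traffic to the primary unless it is marked DOWN, then use its replica."""
    if view[primary]["state"] == "UP":
        return primary
    return replicas[primary]  # temporary promotion while the primary is unavailable

view = {"Server C": {"state": "DOWN"}, "Server B": {"state": "UP"}}
replicas = {"Server C": "Server B"}
print(route("Server C", view, replicas))  # Server B handles traffic until C recovers
```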

Data Conflicts

This architecture emphasizes Availability over Consistency (AP). During network partitions, replicas can be promoted to serve writes temporarily. However, once partitions heal, conflict resolution becomes necessary.

Two effective approaches to conflict resolution are:

Last Write Wins (LWW)

This method uses timestamps to resolve conflicts. Each record tracks the latest update time, and the most recent version wins during a merge.

[Diagram: Server 1 holds Name John (Updated 00:01), Server 2 holds Name Mike (Updated 00:03); Server 2 wins, so both converge on Mike]

This strategy relies on clock synchronization. If clocks are skewed, incorrect records might be selected.
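Last Write Wins reduces to comparing update timestamps during a merge. A sketch (the HH:MM strings stand in for real timestamps):

```python
def lww_merge(local, remote):
    """Keep whichever version carries the later update timestamp."""
    return remote if remote["updated"] > local["updated"] else local

server1 = {"id": 123, "name": "John", "updated": "00:01"}
server2 = {"id": 123, "name": "Mike", "updated": "00:03"}

print(lww_merge(server1, server2)["name"])  # Mike: Server 2 wins
```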

Vector Clocks

A more robust alternative, Vector Clocks, avoids dependency on synchronized clocks. Instead, each record tracks a version vector in the format [(Server, Version)].

For example, a record initially receives its version number from its owner shard. Other servers then replicate the record from the owner, preserving this version for consistency.

[Diagram: Servers 1, 2, and 3 each hold the record Id 123, Name John with vector clock (Server 1: 3)]

Then a network partition occurs, splitting the cluster into three partitions. Each group may update the record independently, expanding the vector clock:

  • Server 1 updates the record and increments its own version.
  • Server 3 updates the record and adds its own entry.
[Diagram: under the partition, Client 1 sets Name to John Doe on Server 1, raising its clock to (Server 1: 4); Client 2 sets Name to John Wick on Server 3, producing (Server 1: 3, Server 3: 1); Server 2 still holds (Server 1: 3)]

A conflict is identified when vectors cannot be merged deterministically:

  • [(Server 1, 4)] is a clear successor of [(Server 1, 3)].
  • But [(Server 1, 4)] and [(Server 1, 3), (Server 3, 1)] conflict, since neither descends from the other (see the sketch below).
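
The dominance check follows directly from the definition: one clock descends from another only if every counter is greater than or equal. A sketch:

```python
def descends(a, b):
    """True if vector clock `a` is a successor of (or equal to) `b`."""
    return all(a.get(server, 0) >= version for server, version in b.items())

def conflicts(a, b):
    """Neither clock descends from the other: the updates were concurrent."""
    return not descends(a, b) and not descends(b, a)

v1 = {"Server 1": 4}
v2 = {"Server 1": 3, "Server 3": 1}
print(descends(v1, {"Server 1": 3}))  # True: a clear successor
print(conflicts(v1, v2))              # True: both versions must be kept
```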

The system maintains both versions:

[Diagram: after the partition heals, Server 1's version (Server 1: 4) overrides Server 2's stale copy but conflicts with Server 3's version (Server 1: 3, Server 3: 1), so both versions are kept]

Application-Level Resolution

The application is responsible for resolving conflicts. It receives all conflicting versions and decides how to merge them.

For instance, it may choose to accept the first version:

[Diagram: the database returns both conflicting versions to the application, which accepts the first one (Name John Doe, vector clock Server 1: 4)]
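
A sketch of this read-time resolution, assuming the database hands back every conflicting sibling and the application picks one:

```python
def read_resolved(siblings):
    """Resolve conflicts in the application; here we naively accept the first version."""
    resolved = siblings[0]  # business logic could instead merge fields or ask the user
    # The chosen value would then be written back under a merged vector clock.
    return resolved

siblings = [
    {"id": 123, "name": "John Doe",  "vc": {"Server 1": 4}},
    {"id": 123, "name": "John Wick", "vc": {"Server 1": 3, "Server 3": 1}},
]
print(read_resolved(siblings)["name"])  # John Doe
```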

Since the database lacks context about the business logic, deferring resolution to the application ensures safer and more flexible conflict handling. Thus, applications should include additional metadata in their records to assist in this process.
