Consensus in Distributed Systems


Distributed systems share a common problem, termed the consensus problem: various nodes in a distributed system need to reach agreement on a specific value.
In the case of a distributed transaction, it's whether the result has been committed or not.
In the case of message delivery, it's whether the message has been delivered or not.
Now let's assume we have a distributed system of k nodes N1, …, Nk, where each node Ni can propose its own value Vi.
The following properties must also be satisfied:
Termination: Every non-faulty node must eventually decide.
Agreement: The final decision of every non-faulty node must be identical.
Validity: The agreed value must have been proposed by one of the nodes.
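To make the problem statement a bit more concrete, here is a minimal sketch (in Java, using a hypothetical interface name of my own) of what a consensus primitive might look like to its callers. It is not tied to any particular algorithm; it just restates the three properties above as a contract.

```java
// A minimal sketch (hypothetical interface) of what a consensus module exposes.
// Each node proposes a value; the module eventually returns the single agreed value.
public interface Consensus<V> {

    // Propose this node's value Vi and block until a decision is reached.
    // Termination: every non-faulty node eventually returns from this call.
    // Agreement:   every non-faulty node returns the same value.
    // Validity:    the returned value was proposed by some node.
    V proposeAndDecide(V proposedValue) throws InterruptedException;
}
```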
Some use-cases of consensus
Distributed Locking
Most distributed systems receive multiple concurrent requests and need to perform concurrency control to prevent data inconsistencies because of interference between these requests.
One of these concurrency control methods is locking. However, using locking in the context of a distributed system comes with many edge cases that add significant risk.
Distributed locking can be modelled as a consensus problem where the nodes of the system agree on a single value, i.e. the node that holds the lock.
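As a rough illustration, and assuming the hypothetical Consensus interface sketched earlier, distributed locking could be expressed like this: every node proposes its own id, and whichever id the group agrees on is the lock holder. This is a sketch of the idea, not how any real lock service is implemented.

```java
// A rough sketch of distributed locking on top of a consensus primitive.
// The nodes agree on a single value: the id of the node that holds the lock.
public class DistributedLock {

    private final Consensus<String> consensus;
    private final String localNodeId;

    public DistributedLock(Consensus<String> consensus, String localNodeId) {
        this.consensus = consensus;
        this.localNodeId = localNodeId;
    }

    // Every node proposes itself as the lock holder; consensus picks exactly one.
    public boolean tryAcquire() throws InterruptedException {
        String holder = consensus.proposeAndDecide(localNodeId);
        return holder.equals(localNodeId);
    }
}
```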
Leader Election
Ensuring only one node acts as the leader at a time.
A good example of this problem is the primary-backup replication scheme, where one node acts as the leader that accepts update requests from clients and forwards those requests to the follower nodes.
The problem of deciding the leader node can be modelled as a consensus problem, because the follower nodes have to agree on a single value: the identity of the leader node.
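Leader election can be sketched with the same hypothetical Consensus interface; again, this is purely illustrative rather than how any specific system elects a leader.

```java
// Leader election expressed with the hypothetical Consensus interface:
// every node proposes its own id, and the agreed value identifies the leader.
class LeaderElection {

    static String electLeader(Consensus<String> consensus, String localNodeId)
            throws InterruptedException {
        return consensus.proposeAndDecide(localNodeId);
    }
}
```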
Raft
Raft grew out of the Paxos algorithm: Paxos was widely considered difficult to understand, which motivated the design of Raft as a more understandable alternative.
It requires a set of nodes that form the consensus group, which is referred to as the Raft cluster. Each of these nodes can be in one of the three states:
Leader
Follower
Candidate
Node States
One node is elected the leader and is responsible for receiving log entries from clients (proposals) and replicating them to the other follower nodes to reach consensus.
The leader is responsible for sending heartbeats to the other nodes in order to maintain its leadership.
Any node that hasn’t heard from the leader for a while will assume the leader has crashed; it will enter the candidate state and attempt to become the leader by triggering a new election.
On the other hand, if a previous leader discovers that another node has gained leadership, it falls back to the follower state.
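Putting the states and transitions above into a simplified sketch (the names, timeout value, and structure here are illustrative, not taken from any real Raft implementation):

```java
// Simplified sketch of Raft node states and the transitions described above.
enum RaftState { FOLLOWER, CANDIDATE, LEADER }

class RaftNode {
    private RaftState state = RaftState.FOLLOWER;
    private long lastHeartbeatMillis = System.currentTimeMillis();
    private final long electionTimeoutMillis = 300; // usually randomized per node

    // Called periodically: if the leader has been silent too long, start an election.
    void onElectionTimerTick() {
        boolean timedOut =
                System.currentTimeMillis() - lastHeartbeatMillis > electionTimeoutMillis;
        if (state != RaftState.LEADER && timedOut) {
            state = RaftState.CANDIDATE;
            // in real Raft the node also increments its term and requests votes here
        }
    }

    // Called when a heartbeat (AppendEntries) arrives from a valid leader.
    void onHeartbeatFromLeader() {
        lastHeartbeatMillis = System.currentTimeMillis();
        state = RaftState.FOLLOWER; // a stale leader or candidate steps back down
    }
}
```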
Terms
To prevent two leaders from coexisting in a cluster, Raft introduces a notion of logical time called terms.
Raft divides time into terms of arbitrary length. Each term is identified by a number (1, 2, 3, ...).
Each term begins with an election.
If the election is successful, one leader manages the cluster for the rest of the term.
If an election fails (a split vote), the term ends, and a new term (with a new election) begins immediately.
Terms act as a logical clock, allowing servers to detect obsolete information.
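Here is a small, illustrative sketch of that "logical clock" behaviour: a node that sees a higher term knows its own information is obsolete, while messages carrying a lower term are rejected as stale. The class and method names are my own.

```java
// Sketch of how terms act as a logical clock (illustrative, not a full implementation).
class TermClock {
    private long currentTerm = 0;

    // Every Raft message carries the sender's term.
    // A higher term means our view is obsolete: adopt the new term (and step down to follower).
    // A lower term means the sender is obsolete and its message should be rejected.
    boolean isMessageCurrent(long senderTerm) {
        if (senderTerm > currentTerm) {
            currentTerm = senderTerm;
            // the caller should also revert to the follower state here
        }
        return senderTerm >= currentTerm;
    }
}
```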
I asked an AI to give me a good analogy for the Raft algorithm, and I got this:
The Analogy: Electing a Class President to Manage a Ledger
Imagine a classroom where students need to keep a shared, ordered list of rules (a ledger). Everyone must have the exact same list in the exact same order. To avoid chaos, they decide on a system:
Elect a President (The Leader): Only the President can add new rules to the ledger. Everyone else just listens to the President.
President's Term: The President serves for a "term." If the President becomes unresponsive (e.g., falls asleep), the class holds a new election for a new term.
Proposing a Rule (Log Replication): A student suggests a new rule to the President. The President writes it down in their draft notebook and tells all the other students (the "Followers") to write it in their draft notebooks too.
Confirming a Rule (Committing): Once a majority of students (including the President) have written the rule in their draft notebooks, the President declares the rule "official." They take out a permanent marker and make the rule final in their ledger. They then tell everyone else to do the same.
Consistency: If a new President is elected, they must have all the "official" rules from the previous terms. You can't elect someone who missed a bunch of important, confirmed rules.
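The "draft notebook vs. permanent marker" step corresponds to Raft's commit rule: the leader only marks an entry as committed once a majority of the cluster has acknowledged it. Below is a toy sketch of that step alone; all names are illustrative, and a real Raft log also tracks terms and indexes for each entry.

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch of the commit rule: replicate first, commit only on a majority of acks.
class ToyLeader {
    private final int clusterSize;
    private final List<String> committedLog = new ArrayList<>();

    ToyLeader(int clusterSize) {
        this.clusterSize = clusterSize;
    }

    // acks = followers that wrote the entry in their "draft notebook", plus the leader itself.
    void onReplicationAcks(String entry, int acks) {
        if (acks > clusterSize / 2) {
            committedLog.add(entry); // the "permanent marker": the entry is now official
            // notify the followers that the entry is committed
        }
    }
}
```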
Conclusion
We've only scratched the surface of consensus, which is a complicated topic, and I had to omit a lot for the sake of brevity. Still, we've explored how algorithms like Raft and Paxos enable distributed machines to work together as a single, fault-tolerant unit.
We've seen how these principles are not merely academic; they are the bedrock of critical, real-world systems. etcd, the distributed brain of Kubernetes, relies on Raft for its clarity and strong leadership model. In contrast, foundational systems like Google's Chubby are built upon the battle-tested, albeit more complex, Paxos algorithm.
While our journey here was brief, I hope it has demystified the core challenge of distributed agreement. For a hands-on look at these concepts, you can explore my simple implementation of Raft in Java on GitHub.