DDIA - Chapter 9 - Consistency and Consensus - Thoughts and notes

Vivek KhatriVivek Khatri
2 min read

Disclaimer: this blog has been written with the help of LLM, please do not hate me

I never imagined that consensus would be such a difficult problem, in a democracy one has consensus right, it is so easy. Even if there are multiple failure points the leader gets elected.

Anyway, this chapter speaks a lot about linearisability, consistency, consensus, how to be fault tolerant during consensus. Thing is that the tradeoff with making the system consistent is performance. There wasn’t much to discover in this chapter, I did find it a bit boring. The basic idea while building a distributed system should be that time is linear. I’d always use epochs.

Linearizability: Core Concepts

Definition

  • Makes a distributed system appear as a single replica without replication lag

  • Ensures all clients see the most recent value once it's written

  • Requires consistent data across all replicas at the same time

Key Characteristics

  • Strict consistency model

  • Once a client sees a new value, all other clients must see that same new value

  • Critical for scenarios like:

    • Leader election in single-leader systems

    • Choosing unique usernames

    • Seat booking systems

Trade-Offs

  • Performance-intensive

  • Often sacrificed for availability

  • Costly in multi-data center environments

  • Not many systems provide true linearizability

Limitations

  • High performance overhead

  • Difficult to implement across distributed systems

  • Network delays can significantly impact implementation

Distributed Transactions

Types

  1. Database-internal transactions

    • Used within same database system

    • Easier to implement

  2. Heterogeneous distributed transactions

    • Involve multiple, different systems

    • Focus on ensuring atomic commits

Key Mechanism: Two-Phase Commit (2PC)

  • Two phases: Prepare and Commit

  • Transaction coordinator manages process

  • Ensures all nodes commit or abort together

Challenges

  • Performance bottlenecks

  • Complexity of coordinating across systems

  • Potential for system-wide blocking if coordinator fails

Consensus

Definition

Getting multiple nodes to agree on a value with specific properties:

  • Uniform agreement

  • Integrity

  • Validity

  • Termination

Key Characteristics

  • Requires majority of nodes to be functioning

  • Uses epoch numbering for leader selection

  • Implements total order broadcast

Limitations

  • Requires strict majority of nodes

  • Difficult to add/remove nodes dynamically

  • Relies on timeout mechanisms

  • Performance can be impacted by frequent leader elections

Coordination Services (e.g., Zookeeper, etcd)

Features

  • Linearizable atomic operations

  • Total ordering of operations

  • Failure detection

  • Change notifications

  • Service discovery

0
Subscribe to my newsletter

Read articles from Vivek Khatri directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Vivek Khatri
Vivek Khatri

I am still deciding what should I write here.