CAP Theorem Explained

The CAP Theorem is an observation of a distributed system. It states that a distributed system can only guarantee 2 out of 3 properties: Consistency, Availability and Partition Tolerance.

CAP

Consistency: Every read gets the latest write, or an error
Availability: Every read gets a (non-error) response, even if it’s outdated
Partition Tolerance: The system continues to operate even if a part of the system is not reachable

In practice, we always need partition tolerance in a distributed system. That is because network failures will occur, and a practical system must consider how to operate when they do.

CA vs CP vs AP

Imagine in a system with 2 nodes. A user updates a key x to have the value 3 and the request is handled by node A. Next, a network failure occurs, and node A is not able to propagate this change to node B. A partition has formed between A and B.

At this point, if a user tries to query for the value of x and the request is handled by B, it can either choose to return the value it has in cache (5) and thus choose availability over consistency (AP), or it can return an error since it cannot coordinate with the other node in the system (node A) to agree on a value, thus choosing consistency over availability (CP). There is no way to choose both consistency and availability when a partition occurs.

However, if we say that a system does not have to be tolerant of partitions, then we don’t have to choose between C and A because when the above scenario happens, the system stops operating! In reality, CA is only a practical choice when there is one node (i.e. not a distributed system), or a tightly coupled system where network reliability is assumed.

Consistency Levels

In the CAP theorem, we are speaking about consistency in terms of strong consistency. However, in practice, databases support a range of consistency levels.

Consistency Level	Description	Example use case	Closest pair in CAP
Strong	Every read gets the latest write	Every request to transfer money out of an account will definitely read the latest value of the account.	CP
Sequential	All nodes agree on a global ordering of operations, which might not be chronological — a later operation in real-time can appear earlier in the global order.	Cannot think of a meaningful example in the real world.	CP
Casual	Writes that are casually related are seen in the same order. If one operation causes another, then everyone must see them in the same order.	You cannot see a comment to a post if you have not seen the post.	Mix of CP and AP
Eventual	All nodes converge to the same value, but no ordering guarantees	Shopping carts - let’s say you add items from different devices. It doesn’t matter if one device doesn’t immediately show the items added from the other device, as long as is eventually converges (in an acceptable time range).	AP
Read-your-writes	A user will always read their latest write, but other users might not	A user who just created a post will definitely see their post once it’s created, even if other users have not.	CA

The CAP theorem allows us to understanding the trade-offs in a distributed system, which helps us to decide what properties we need in our application. For one feature, eg. bank transfer, we could use CP, and for another feature in an application, like viewing account balance, it could be AP.

CAP Theorem and consistency levels

Table of contents

CAP

CA vs CP vs AP

Consistency Levels

Subscribe to my newsletter

Software Engineering Blog

Software Engineering Blog