Tuesday, May 4, 2010

CAP Theorem: An informal proof

[Attention: Pending peer review]

Prof. Eric Brewer made a conjecture in the keynote speech at the Principles of Distributed Computing (PODC) conference in 2000. It has since been called the CAP theorem and goes as follows:

A distributed system cannot guarantee all three properties, namely, consistency (C), availability (A), and partition tolerance (P) at the same time in the presence of network failures. It can only guarantee any two of the three.

CAP theorem has important implications for web service designers and users.

This was formally proved by Gilbert and Lynch of MIT in 2002. The formal theorem is nice but hard to understand for mere mortals like us. So I have tried to give a more informal proof that is more comprehensible.

For simplicity, let's assume that the distributed system has just one object O - say, the number of tickets available for a movie show at 1 pm today. So it is just one number. We have built an awesome web service that offers just 2 operations: read ticket count, and update ticket count. The web service runs on a distributed system consisting of 100 nodes scattered across the internet.

Some definitions:

Consistency that web service users expect is linearizability, meaning that if there are simultaneous concurrent reads and writes to the object O, the distributed/parallel system should linearize or sequence these operations in such a way that it appears to the user as executing on a single node. In effect, read operation return the value posted by the last update, and updates happen sequentially one after another. For example, if the initial value of O is 1000 and I update it to 999 via node 53, and the next instant someone on node 72 asks for the value of O, the system should return 999 and not 1000. In other words, the users of the web service don't care if there is single server behind the web service or a massive system distributed across the four corners of the planet. They want to see the updates they made most recently, and that's it.

Consistency comes in many forms. One is strong consistency which says that any subsequent read that follows an update will always see the latest update. The other is weak consistency which says that only reads that come after certain elapsed time (called the inconsistency interval or inconsistency window) will see the update. There is no one form of consistency that is suitable for all applications. Banks need strong consistency because they handle money, but airplane reservations are ok with weak consistency because they allow overbooking. (See Eventually Consistent for better definitions).


Availability means that when a read or update operation is invoked on the system, the system will always complete the operation and respond back with a success or failure. To an observer, the web site is available.

Partition tolerance means that if there is a network partition, then all the updates made to the system during the outage will be captured and propagated to all the nodes when the outage completes. That is, no updates are lost. By network partition, we mean that one or more nodes are unable to communicate with one or more other nodes in the distributed system. As a result, and during a network partition, the inconsistency interval increases dramatically - many times the normal inconsistency interval.

Informal proof:

Case 1: A and P are satisfied but C is not.

Let's assume that there is a network partition, i.e., some set of nodes in the distributed systems cannot communicate with another set of nodes, but each node is alive and well.

We will now show that either one of C, A or P is not possible or we arrive at a contradiction.

Let's say the user invokes the web service to make an update during the network partition.

The system now has two choices: to allow the update or not allow the update. If the system decides to not allow the update, then the system is not Available (A).

If the system decides to allow the update, then the system is Available. However, the question is - will the update reach all the nodes?

If the update does not reach all the nodes, then the system is not partition tolerant (P).

However, if the update reaches all the nodes, the system is partition tolerant (P). But when do the updates reach the nodes?

If the updates reach instantly or within the inconsistency interval, then the network is ok and there is no network partition - a contradiction to our assumption we made at the start of the proof.

If the updates do not reach all the nodes instantly or within the inconsistency interval, then some nodes in the system have not been informed of the update. Thus the value of O is different on different nodes after the inconsistency interval has elapsed. Therefore the system is not Consistent (C).


Case 2: If A and C are satisfied by the system, then P is not satisfied.

Let's say there is a network partition in progress. Let's say the user invokes the web service to make an update during the network partition.

The system now has two choices: to allow the update or not allow the update. If the system decides to not allow the update, then the system is not Available (A).

If the system decides to allow the update, then the system is Available. However, the question is - will the update be reflected on all the nodes?

If the update is not reflected on all the nodes, then the system cannot be consistent (C).

If the update is reflected on all the nodes, then the system is consistent (C). However, this implies that communication was possible between the nodes. This contradicts our assumption that there was a network partition.

This implies that the update made during the network partition was not reflected on all the nodes. Since the system is consistent (C), and the only option for it to remain consistent is to abandon the update made during the network partition. This means the system is not Partition Tolerant (P).

Case 3: If C and P are satisfied by the system, A is not satisfied.

Let's say there is a network partition, and let's assume the system is consistent (C).

Since the system is consistent, it must ensure that all the updates are made available to all the nodes.

When there is a network partition, the system has two choices: to ignore the updates made during the network outage, or to save them for later updates.

If the system ignores the updates made during the network outage, then it is not partition tolerant (P).

Therefore the system must save the updates for propagating them to all the nodes when the network outage is done.

Given that the system is consistent (C), it cannot allow reads to proceed anywhere in the system as long as there are pending updates to be propagated.

If the systems allows the reads to proceed before the updates are propagated, then the reads will see old values and not the latest values waiting to be propagated, thus contradicting consistency (C) assumption.

Therefore it must not respond to the read requests coming into the system. This means the system is not Available (A).

End of Informal proof.


Amazon's web services make various tradeoffs between A and C given P. Interested readers can read Werner Vogels' article on being Eventually Consistent. It is vital for web developers to understand the limitations of web services so that they can build more reliable applications that depend on web services.

No comments:

Post a Comment