Two Phase Locking
In our previous blog post, we discussed the problems related to concurrency control in database management systems. Now, we will explore a solution to these problems using the two-phase locking protocol. This technique helps prevent conflicting transactions from accessing the same data simultaneously.
Lock
We need a way to guarantee that all execution schedules are correct without knowing their schedule of execution.
The solution is to use LOCK to protect database objects. These objects can be a tuple (row), an attribute, or even a table.
Each transaction has to acquire a lock on the object before using it.
let's take an example without and with LOCK.
Suppose there are 2 transactions (T1 and T2) on attribute "A" at the same time.
** without LOCKS **
T1 T2
BEGIN
R(A) -> $10
BEGIN
R(A) -> $10
W(A) -> $20
A = A * 2
W(A) -> $20
COMMIT
COMMIT
Here the final value will depend on the schedule. T2 has
read the value $10 but when it's doubling it. it was
updated by T1. So It's doubling a wrong value!
Remember! Now each transaction should acquire lock on
the objects before making any operation. and when
a transaction want to lock an object which already
locked by another transaction. it has to wait until
it's unlocked!
** with LOCKS **
T1 T2
BEGIN
LOCK(A)
R(A) -> $10
BEGIN
LOCK(A) -> refused(wait)
W(A) -> $20 .
UNLOCK(A) -> now T2 can use it .
LOCK(A)
R(A) -> $20
A = A * 2
W(A) -> $40
UNLOCK(A)
COMMIT
COMMIT
Now the final value is true with LOCKS.
Locks Demo video with PostgreSQL
Basic Lock Types
S-LOCK: Shared locks for reads
X-LOCK: Exclusive lock for writes.
The S-lock allows multiple transactions to read an object simultaneously without waiting, while the X-lock does not allow multiple writes. If a transaction has acquired an X-lock on an object, no other transaction can read from or write to it until the lock is released.
Let's take a new example with the types:
T1 T2
BEGIN
X-LOCK(A)
R(A) -> $10
W(A) -> $20
UNLOCK(A)
BEGIN
X-LOCK(A)
R(A) ->$20
A = A*2
W(A) ->$40
UNLOCK(A)
S-LOCK(A)
R(A) -> $40
UNLOCK(A)
COMMIT
COMMIT
Here T1 requests an X-LOCK first because it will write
on 'A'. then it release it because there is no need
for it again. we just need a S-LOCK in the next
operation. but based on the schedule. T2 wrote on 'A'
before T1 asks for the second operation. now T1 reads 'A'.
but it read a wrong value? it was expected to read $20.
So that's a problem!!. let's discuss a solution for it.
Two-Phase Locking
It's a concurrency control protocol in database management systems. The protocol ensures transaction serializability by dividing the execution of a transaction into two phases - the growing phase and the shrinking phase.
Phase #1: Growing
- During this phase, the transaction acquires locks on the data items it needs to access.
Phase #2: Shrinking
This phase is fired immediately once any lock is released.
During this phase. you can never acquire a lock. so the DBMS has to acquire any LOCK it needs for the transaction before it releases anyone.
As you see in the graph above. the number of locks is increasing in the Growing Phase. but once we entered the Shrinking Phase. The number of LOCKs is decreasing while we are releasing them.
The graph shown above violates the rules of the Two-Phase Locking protocol. The number of locks should not increase during the Shrinking Phase, as this can lead to concurrency-related issues. It's important to always follow the rules of the protocol to ensure transaction serializability and correctness.
Let's execute our last example but now with a 2-phase locking protocol
T1 T2
BEGIN
X-LOCK(A)
R(A) -> $10
W(A) -> $20
**don't release the lock**
BEGIN
X-LOCK(A) -> refused(wait)
R(A) -> $20
now we can release
UNLOCK(A)
now T2 will start executing
X-LOCK(A)
R(A) -> $20
A = A*2
W(A) -> $40
COMMIT
COMMIT
Now the problem above fixed but 2-phase locking. T1 read the
value it should read. and T2 executed also. so the 2
transactions are executed successfully.
but!
What if T1 is ABORTED for any reason?
now T2 updated 'A' with a value based on T1 modification
which is an ABORTED transaction. That means that T2 also has
to ABORT.
That problem is called cascading aborts. It may lead to poor
performance because both transactions have to restart.
So one of the disadvantages of 2-phase locking is cascading aborts.
The solution for that is
Strong Strict Two-Phase Locking
It's similar to basic 2-phase locking. There is one key difference. All locks are kept until the end of the transaction. we won't release any of them until the end. This means that no other transaction can access the locked objects until the current transaction is completed.
Let's take the example above but with Strong Strict 2PL:
Assume we have 2 bank accounts
T1 T2
BEGIN
X-LOCK(A)
R(A) -> $10
W(A) -> $20
BEGIN
X-LOCK(A) -> refused(wait)
R(A) -> $20
COMMIT
UNLOCK(A)
X-LOCK(A)
R(A) -> $20 or $10
A = A*2
W(A) -> $40 or $20
COMMIT
Here T2 will always get the true value even if T1 ABORTED.
because it won't be able to see the object 'A' until the
transaction finishes. if the transaction ended with COMMIT
it will see the new value (20) and if it's ended with ABORT
it will see the old one (10).
Considering that nothing is perfect, do you feel that this
is the best solution available? There is nothing in life
without drawbacks :), it may lead to a deadlock. We can look
at an example to further understand this issue.
suppose there are 2 bank accounts 'A' and 'B'
T1 T2
BEGIN
X-LOCK(A)
BEGIN
S-LOCK(B)
R(B)
S-LOCK(A) -> refused(wait)
R(A)
X-LOCK(B) -> refused(wait)
W(B)
The current issue is that Transaction 1 (T1) requires an
exclusive lock (X-LOCK) on 'B', but cannot obtain it because
Transaction 2 (T2) has already acquired a shared lock
(S-LOCK) on it and has not yet released it. Meanwhile, T2
needs a shared lock on 'A', but T1 has already acquired an
exclusive lock on it and has not yet released it, resulting
in T2 being blocked and waiting for T1 to release the lock.
Since both transactions are required to release all locks at
the end of the transaction per the Strong Strict 2PL
protocol, neither can release their locks and both
transactions remain blocked indefinitely, resulting in
a deadlock.
Deadlock Handling
Deadlock Detection
one solution that most Database systems provide is deadlock detection. every Database checks periodically for deadlocks. and if it found one. it will select one of the transactions and restart it. so that will allow the second one to complete its work.
The selection criteria differ from system to system.
it may be selected by the transaction progress.
may by age. like the oldest one has to complete and the newest has to restart.
some systems may roll back some operations. not the whole transaction.
Deadlock Prevention
When a transaction tries to acquire a lock that is held by another transaction, the Database System kills one of them to prevent a deadlock.
There are 2 approaches here:
Wait-Die (Old waits for Young)
- If a transaction is requesting a lock on a locked object. the older one will be aborted (The one which has the lowest timestamp).
Wound-Wait (Young waits for Old)
- If a transaction is requesting a lock on a locked object. the newer one will be aborted (The one which has the highest timestamp).
Let's take an Example:
Let's assume that T1 came before T2 so it has lower
timestamp.
T1 T2
BEGIN
X-LOCK(A)
BEGIN
X-LOCK(A)
R(A)
The first approach (Old waits for Young) will abort T1 and
make T2 continue its work.but the second one (Young waits
for Old) will abort T2 and let T1 continue.
Recap
Basic 2-Phase Locking
Advantages:
- Ensures serializability: 2PL ensures that transactions are executed in a serializable fashion, which means that the final state of the database is the same as if the transactions were executed serially, one after the other.
Disadvantages:
- Cascading aborts: If a transaction fails, it may have to roll back, which can cause other transactions that depend on its changes to also roll back. This is known as cascading rollbacks and can lead to poor performance.
Strong Strict 2-Phase Locking
Advantages:
Provides stronger guarantees of serializability.
Solve Cascading Aborts problem: that lead to better performance.
Provides better isolation: It provides better isolation between transactions, ensuring that the changes made by one transaction are not visible to others until it is committed. because it doesn't release locks until it's committed.
Disadvantages:
It may lead to deadlocks.
Can lead to starvation: where some transactions are blocked indefinitely, waiting for locks to be released by other transactions.
Conclusion
Locks and 2-phase locking protocol are essential tools for concurrency control in database management systems. Locks provide a mechanism for controlling access to shared resources, ensuring that only one transaction can access a resource at a time. The 2 phase locking protocol is a widely used technique for ensuring transactional consistency, in which transactions acquire locks in two phases - the growing phase and the shrinking phase.
Basic 2-phase locking provides a simple mechanism for concurrency control, while Strong Strict 2 Phase Locking provides stronger guarantees of transactional consistency. However, both protocols have their advantages and disadvantages, and the choice between them should be based on the specific requirements of the database system and the nature of the transactions being executed.
Subscribe to my newsletter
Read articles from Diaa Badr directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by