Introduction to Distributed Transactions

ChidinmaChidinma
12 min read

Transactions

Before we look into distributed transactions, it is important we understand transactions.

What is a transaction?

A transaction is a set of operations executed as a single, indivisible unit of work on a database ensuring ACID (Atomicity, Consistency, Isolation, and Durability) properties. This simply means that a transaction is a way for an application to group several reads and writes together into a logical unit. The entire transaction either succeeds (commit) or fails (abort). If the entire transaction fails, it can be retried without having to worry about partial failures - for example, cases where some reads and writes succeed but others fail - ensuring that either all operations succeed or none do.

Why are transactions important?

Transactions are important because they provide safety guarantees through the ACID properties, which stand for Atomicity, Consistency, Isolation, and Durability. This acronym was coined by Theo Härder and Andreas Reuter in 1983.

Atomicity

Recall that an atom is the smallest unit of matter that retains the properties of an element. In that same vein, atomicity is an indivisible unit of work in a transaction. Just as an atom cannot be split without changing its essential nature, a transaction's atomicity ensures that all operations within it are treated as a single, indivisible unit. Atomicity guarantees that either all of the operations in a transaction are completed or none of them take effect. If any part of the transaction fails, the entire transaction fails and is rolled back, maintaining the system's integrity. Atomicity ensures that we do not encounter partial failure or partial complete scenarios; the transaction either fully succeeds or fully fails.

Scenario

For example, let's consider a banking system where a user wants to transfer €100 from Account A to Account B. This transaction involves two operations:

  1. Debit €100 from Account A,

  2. Credit €100 to Account B.

Scenario 1: Successful Transaction

  • Begin Transaction

  • €100 is debited from Account A.

  • €100 is credited to Account B.

  • The transaction is committed.

After the transaction is committed, the balances in the two account are as follows:

  1. Account A: Original balance - €100

  2. Account B: Original balance + €100

BEGIN TRANSACTION;

-- Deduct €100 from Account A
UPDATE accounts SET balance = balance - 100 WHERE id = 'A';
-- Add €100 to Account B
UPDATE accounts SET balance = balance - 100 WHERE id = 'B';

COMMIT; -- Finalize transaction

Scenario 2: Failed Transaction

  • Begin Transaction

  • €100 is debited from Account A.

  • An error occurs before crediting €100 to Account B (e.g., a network failure, system crash, or validation error).

  • The transaction is rolled back.

After the transaction is rolled back, the balances are as follows:

  1. Account A: Original balance (no change, since the debit operation was undone).

  2. Account B: Original balance (no change, since the credit operation never occurred).

BEGIN TRANSACTION;

-- Deduct €100 from Account A
UPDATE accounts SET balance = balance - 100 WHERE account_id = 'A';
-- An error occurs here, so the next line is never executed
UPDATE accounts SET balance = balance + 100 WHERE account_id = 'B';

ROLLBACK; -- Undo all operations

Without atomicity, transactions can result in partial updates, leading to data inconsistencies and integrity issues.

Consistency

Consistency guarantees that a transaction transitions the database from a valid state to another while maintaining the databases' predefined rules. These predefined rules are specific to the application and they define what constitutes a valid state.

While the database guarantees Atomicity, Isolation, and Durability, Consistency is primarily the responsibility of the application. The database can enforce constraints and triggers (e.g., foreign keys, unique constraints), but it is up to the application to define and uphold the validity of the data. An example of an application defined constraint is: "account balances cannot be negative" or "inventory levels cannot be below zero". We can say that while the database provides the mechanisms to maintain atomicity, isolation, and durability, both the application and the database share the responsibility of maintaining consistency.

Scenario

For example, let's process a product purchase. The goal is to ensure that the stock levels do not go negative when reducing stock and recording the sale.

Assume we have two tables: Products and Sales

CREATE TABLE Products (
    ProductID INT PRIMARY KEY,
    ProductName VARCHAR(255),
    Stock INT
);

CREATE TABLE Sales (
    SaleID INT PRIMARY KEY AUTO_INCREMENT,
    ProductID INT,
    Quantity INT,
    SaleDate DATETIME,
    FOREIGN KEY (ProductID) REFERENCES Products(ProductID)
);
-- Start the transaction
BEGIN TRANSACTION;

-- Check stock level
SET @stock_level = (SELECT Stock FROM Products WHERE ProductID = 1);

IF @stock_level < 20 THEN
    -- Rollback the transaction if stock level is insufficient
    ROLLBACK;
ELSE
    -- Reduce stock
    UPDATE Products
    SET Stock = Stock - 20
    WHERE ProductID = 1;

    -- Record the sale
    INSERT INTO Sales (ProductID, Quantity, SaleDate) VALUES (1, 20, NOW());

    -- Commit the transaction
    COMMIT;
END IF;

Scenario 1: Successful Transaction

  • Begin Transaction

  • Check stock level.

  • The stock level is sufficient, reduce stock.

  • Record the sale in the sales table.

  • The transaction is committed.

Scenario 2: Failed Transaction

  • Begin Transaction

  • Check stock level.

  • The stock level is insufficient, roll back the transaction.

Without consistency, data integrity rules may be violated, causing invalid data states.

Isolation

Isolation ensures that transactions are executed independently without interference. Each transaction appears as if it is the only one running, preventing concurrent transactions from affecting each other’s states.

Consider an online banking system. Two users, Nemi and Shalini, are transferring money from their accounts to a shared savings account simultaneously. Isolation guarantees that Nemi's transaction doesn't see Shalini's transaction in progress and vice versa. If Nemi transfers €400 and Shalini transfers €200, isolation ensures the final balance reflects both transactions correctly, without either transaction interfering with the other. Without isolation, if Nemi's transaction reads the balance while Shalini's transaction is halfway, it might result in incorrect final balances.

As you can imagine, in large systems with busy databases, isolation becomes crucial as multiple transactions occur concurrently.

Isolation solves several problems such as:

  • Preventing Dirty Reads. This happens when transactions read uncommitted changes from other transactions. Preventing dirty reads avoids potential errors from temporary data.

  • Preventing Non-Repeatable Reads: Ensures that once a transaction reads a value, it will see the same value if it reads again, preventing inconsistencies from concurrent updates.

  • Preventing Phantom Reads: Prevents new rows from being added or existing rows from being deleted by other transactions during a transaction, ensuring stable query results.

  • Preventing Lost Updates: Prevents multiple transactions from overwriting each other's changes.

  • Preventing Inconsistent Retrievals: Ensures consistent data retrieval by isolating transactions.

  • Preventing Uncommitted Data Visibility: Prevents transactions from seeing intermediate states of other transactions, ensuring only committed data is visible.

Levels of Isolation

Isolation levels control the degree to which the transactions are isolated from each other. There are four main levels:

  1. Read Uncommitted: Transactions can see uncommitted changes made by other transactions. This level offers the fastest isolation level due to minimal locking... a transaction doesn't need to wait for other transactions to finish. Another pro is low resource consumption. However, this can lead to dirty reads which in turn leads to data inconsistencies and anomalies.

    Scenario: Account balance is €250. One transaction reads data that is being modified by another uncommitted transaction (Dirty read).

     -- Transaction 1: Update account (but don't commit yet)
     SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
     BEGIN TRANSACTION;
     UPDATE accounts SET balance = balance - 100 WHERE account_id = 10;
    
     -- Transaction 2: Read the uncommitted data
     SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
     BEGIN TRANSACTION;
     SELECT balance FROM accounts WHERE account_id = 10; -- Can see uncommitted changes
     -- Output: 150.00 (uncommitted data)
     COMMIT;
    
     -- Transaction 1: Commit the update
     COMMIT;
    
  2. Read Committed: Transactions can only see changes committed by other transactions. This level of isolation provides better performance than higher isolation levels due to fewer locks. Prevents dirty reads (only committed changes are visible) but allows non-repeatable reads and phantom reads

    Scenario: Account balance is €250. A transaction reads only committed data, preventing dirty reads.

     -- Transaction 1: Update account (but don't commit yet)
     SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
     BEGIN TRANSACTION;
     UPDATE accounts SET balance = balance - 100 WHERE account_id = 10;
    
     -- Transaction 2: Read the committed data
     SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
     BEGIN TRANSACTION;
     SELECT balance FROM accounts WHERE account_id = 10; -- Sees only committed changes
     -- Output: 250.00 ( original committed data)
     COMMIT;
    
     -- Transaction 1: Commit the update
     COMMIT;
    
     -- Transaction 2: Read the now committed data
     BEGIN TRANSACTION;
     SELECT balance FROM accounts WHERE account_id = 10;
     -- Output: 150.00 (committed data after Transaction 1)
     COMMIT;
    
  3. Repeatable Read: Ensures that if a transaction reads a value, it will see the same value if it reads again. This isolation level ensures consistent data within a transaction and provides higher level of data consistency compared to Read Committed. It prevents non-repeatable reads but allows phantom reads. This level also engages in more locking than Read Committed, potentially reducing performance.

    Scenario: Account balance is €250. A transaction reads the same data multiple times and gets the same result each time, preventing non-repeatable reads.

     -- Transaction 1: Start and read data
     SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
     BEGIN TRANSACTION;
     SELECT balance FROM accounts WHERE account_id = 10; -- Reads initial balance
     -- Output: 250.00
    
     -- Transaction 1: Update balance but do not commit yet
     UPDATE accounts SET balance = balance - 100 WHERE account_id = 10;
     -- The balance in the database is now 150, but Transaction 1 has not committed yet
    
     -- Transaction 2: Start and read data
     SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
     BEGIN TRANSACTION;
     SELECT balance FROM accounts WHERE account_id = 10; -- Should see the initial balance
     -- Output: 250.00 (Repeatable Read ensures this)
    
     -- Transaction 1: Commit the update
     COMMIT;
    
     -- Transaction 2: Read data again within the same transaction
     SELECT balance FROM accounts WHERE account_id = 10;
     -- Output: 250.00 (Repeatable Read ensures this)
    
     -- Transaction 2: Commit to end the transaction
     COMMIT;
    
     -- Transaction 2: Start a new transaction and read data again
     START TRANSACTION;
     SELECT balance FROM accounts WHERE account_id = 10; 
     -- Output: 150.00 (Reflects the committed update from Transaction 1)
     COMMIT;
    
  4. Serializable: Transactions are completely isolated from each other, ensuring full consistency. This isolation level prevents dirty reads, non-repeatable reads, and phantom reads. However, it is the slowest isolation level due to extensive locking and blocking, and it has the highest resource consumption, leading to potential bottlenecks.

    Scenario: Account balance is €250. A transaction is fully isolated from the other ensuring the highest level of isolation, preventing phantom reads and ensuring data consistency across transactions.

     -- Transaction 1
     SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
     BEGIN TRANSACTION;
     SELECT balance FROM accounts WHERE account_id = 1; -- Reads initial balance
     UPDATE accounts SET balance = balance - 100 WHERE account_id = 10;
     -- Output: 150
    
     -- Transaction 2
     SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
     BEGIN TRANSACTION;
     SELECT balance FROM accounts WHERE account_id = 10; -- Waits until Transaction 1 completes
    
     -- Transaction 1: Commit the transaction
     COMMIT;
    
     -- Transaction 2: Now it can proceed
     -- Output: 150
     COMMIT;
    

Durability

Durability guarantees that once a transaction has been committed, its changes are permanently recorded in the database, even in the event of system failures.This means that the changes made by the transaction are saved to a non-volatile storage, such as HDDs or SSDs.

Mechanisms Ensuring Durability

Many systems guarantee durability through mechanisms such as:

  • Write-Ahead Logging (WAL): Before applying them to the database, the system uses a type of log known as the write-ahead log to log changes. This log ensures data consistency and prevents corruption during crashes. However, write-ahead logging (WAL) requires additional storage for logs and can introduce performance overhead.

  • Checkpointing: The system provides recovery points by saving database states to disk periodically. This reduces the recovery time after a crash, but it can consume significant storage space and may lead to data loss between snapshots.

  • Redundant Storage: The system protects against hardware failures by duplicating data across multiple storage devices. Techniques such as replication, regular backups, and RAID 1 (mirroring) are employed to ensure data is not lost even if a storage device fails. However, managing multiple copies increases complexity and can lead to consistency issues if not properly synchronised.

Scenario

Assume you have an Accounts table and you want to transfer money between two accounts. The transaction should ensure that the transfer is logged, and the changes are committed so that they persist even if the system crashes.

-- AccountID 1 = 500.00
-- AccountID 2 = 200.00

-- Start the transaction
BEGIN TRANSACTION;

-- Assume you want to transfer 100 from AccountID 1 to AccountID 2

-- Step 1: Log the transaction
INSERT INTO TransactionsLog (FromAccountID, ToAccountID, Amount) VALUES (1, 2, 100.00);

-- Step 2: Update the balance of AccountID 1
UPDATE Accounts
SET Balance = Balance - 100.00
WHERE AccountID = 1;

-- Step 3: Update the balance of AccountID 2
UPDATE Accounts
SET Balance = Balance + 100.00
WHERE AccountID = 2;

-- Commit the transaction
COMMIT;

-- AccountID 1 = 400.00
-- AccountID 1 = 300.00

Distributed Transactions

Distributed transactions involve multiple databases that are geographically distributed across a network. Unlike single-database transactions, distributed transactions require coordination to ensure that all participants either commit or roll back as a single atomic operation. As you can imagine, this adds complexity due to challenges related to resource failures, coordination, and network issues, making it difficult to achieve ACID guarantees.

Atomic Commit

The atomic commit problem involves ensuring that a distributed transaction is either committed (completed on all nodes) or aborted (undone on all nodes) even in the presence of failures. All nodes need to reach an agreement on whether or not to commit or abort a transaction. Atomic Commit employs protocols like as Two-Phase Commit (2PC), or Three-Phase Commit (3PC) to coordinate the participants.

Atomic Commit vs Consensus

Atomic commit and consensus are related concepts in distributed systems, but they serve different purposes.

Atomic CommitConsensus
PurposeThe goal of atomic commit is to ensure transaction completeness. That is, a distributed transaction is either fully committed or aborted across all participants.The goal of consensus is to achieve agreement among distributed participants on a specific value or decision.
ProcessEvery node must vote whether to commit or abort.One or more nodes propose a value.
OutcomeHere, the transaction must commit if all the nodes vote to commit and must abort if any node votes to abort.Here, any one of the proposed values is chosen.
Failure HandlingIf a participating node crashes, the transaction must abort.Consensus algorithms like Raft can tolerate crashes as long as a quorum of nodes is working.
ProtocolsTwo-phase commit (2PC) and three-phase commit (3PC) are commonly used to achieve atomic commitment.Paxos and Raft are commonly used to reach a consensus.
FocusAtomic Commit focuses on transaction integrity and consistency in distributed transactions.Consensus focuses on maintaining consistent state and fault tolerance in distributed systems.

Conclusion

In this article, we covered the basics and differences between transactions and distributed transactions. We explored the ACID properties, the challenges of distributed transactions, and the differences between atomic commit and consensus.

This understanding lays the groundwork for our further exploration into protocols which are used to address the atomic commitment problem, such as Two-Phase Commit (2PC) and Fault-tolerant Two-Phase Commit (FT-2PC), Three-Phase Commit (3PC), Saga, Event Sourcing, Google's Spanner and Percolator.

I hope you are as excited as I am for what's to come!

References

  1. Distributed Systems Lecture Series by Martin Kleppmann

  2. Designing Data-Intensive Applications

0
Subscribe to my newsletter

Read articles from Chidinma directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Chidinma
Chidinma

Software engineer in the Netherlands. I am committed to continuous learning and sharing knowledge. I write about topics related to data structures -•- algorithms -•- distributed systems -•- software architecture -•- and more.