SQL Transactions and ACID properties; simplified like never before.
Table of contents
- Understand What Transactions are with Cars!
- Without Transactions: A World of Chaos
- Advantages: Why You Need Them
- Disadvantages: The Price You Got To Pay
- States of Transactions
- ACID Properties
- Practical Guide
- Step 1: Open You SQL Editor
- Step 2: Create Database
- Step 3: Create Tables and Populate with Sample Data
- Step 4: Understand how a Basic and Successful Transaction Works
- Step 5: Understand how a Basic and Failed Transaction is Rolled Back Entirely
- Step 6: Understand how Rollback to specific Savepoints within a Transaction.
- Time to Wrap Up!
Understand What Transactions are with Cars!
A Transaction in SQL is group of single task / multiple tasks that is / are labelled as single unit of execution / operation. Each transaction begins with a specific task and ends when all the tasks in the group successfully completed.
Imagine Lightning McQueen, the race car, participating in the Piston Cup (Racing Event in the Cars franchise).
Here, the Transaction for McQueen is participating in the race.
The tasks for McQueen in the participation would be fuelling the tank up, checking tires, tuning the engine, checking brakes, regulating speed, cornering on sharp turns, etc.
McQueen could encounter obstacles like empty fuel tank, flat tires, clash with other cars, breaking rules.
The outcome of the transaction could be
either McQueen completes all tasks and finishes the race
or fails to start or completes the race if any task is incomplete or encounters obstacles.
Without Transactions: A World of Chaos
Imagine a banking application that processes money transfers.
If one part of the transfer fails — say, the deduction is successful but the addition fails — this could lead to missing money, double charges, or even orphaned records.
You send 1000 rupees to your friend to fuel up his Car. The money deducts from your account and never credits into his.
How cool is that? Absolutely not! The following are some real consequences if we didn’t have transactions:
Data Inconsistency: Every partial update could leave the database in an inconsistent state. Imagine booking a hotel room; if payment goes through but room allocation fails, you'd lose money but not get the room.
Risk of Cascading Failures: For complex operations, a failure in one task could cause unexpected errors in later tasks. For example, if inventory management updates are interrupted, it could result in duplicate or missing records.
Increased Manual Intervention: Businesses would need to monitor and fix incomplete operations manually, which increases overhead and is prone to human error.
System Integrity Risks: Without transactions, data reliability is reduced, leading to a lack of trust in the system.
Advantages: Why You Need Them
Data Consistency and Integrity:
SQL transactions maintain consistency by ensuring that only valid data is committed to the database.
Using transactions, any operation that fails mid-execution can roll back to its previous state, preserving data accuracy and integrity.
For example, in a bank transfer, both the debit and credit parts of the transaction are required to complete successfully. If any part fails, the whole transaction is rolled back, preventing partial updates.
Error Handling and Recovery:
Transactions make error handling straightforward.
If an error occurs during any part of a transaction, the system can roll back to the original state before the transaction started. This recovery mechanism minimizes the impact of faults.
Savepoints, often used within transactions, allow rollback to specific points, giving more flexibility in handling complex sequences of operations.
Concurrent Access Management:
Transactions help manage concurrent access to data, which is crucial in multi-user environments.
By locking resources temporarily, transactions prevent "dirty reads" (reading uncommitted data) and ensure reliable data.
For example, in an e-commerce platform, two users shouldn’t be able to buy the last item in stock simultaneously, which transactions handle effectively.
Isolation Levels and Control:
SQL transactions support isolation levels (READ UNCOMMITTED, READ COMMITTED, REPEATABLE READ, and SERIALIZABLE) that allow developers to choose the level of data visibility among concurrent transactions based on performance and integrity needs.
Higher isolation levels prevent issues like phantom reads or non-repeatable reads, although they may come with performance trade-offs.
Enhanced Security:
Transactions add a layer of security by ensuring that data is either fully updated or fully restored, making it easier to track changes and prevent partial updates.
For instance, a banking system can implement transactions to ensure no data is compromised during account modifications or financial operations.
Disadvantages: The Price You Got To Pay
Performance Overheads:
Transactions, especially those involving locking mechanisms and high isolation levels, can slow down database performance. Locks prevent other users from accessing data until a transaction completes, which can reduce throughput in high-traffic applications.
In real-time systems or high-frequency trading, transaction overhead can become significant and require careful management.
Complexity in Multi-Stage Transactions:
Long-running, multi-stage transactions are complex to manage, especially if they span across multiple services or databases. Coordination between multiple databases or systems can be challenging and often requires advanced techniques, like distributed transactions.
For example, processing orders that require checks across multiple services (such as inventory and billing) can become complicated due to the need for two-phase commit or other mechanisms to manage cross-database transactions.
Deadlocks and Blocking Issues:
Transactions, especially when they involve locks on shared resources, can result in deadlocks if multiple transactions are waiting on each other’s locks to release. Deadlocks need to be detected and resolved, often by rolling back one of the transactions.
Blocking can also occur, where one transaction holds a lock on a resource that another transaction is waiting to access, leading to delays.
Increased Resource Consumption:
Transactions can consume more resources, such as CPU and memory, as they hold information about the initial state and lock resources for rollback purposes. Long transactions can tie up resources for extended periods, especially if they fail to commit or roll back promptly.
Resource management becomes critical, especially in databases where many concurrent transactions are active.
Complexity in Error Handling and Debugging:
Although transactions offer error-handling advantages, troubleshooting can still be complex. In large applications with nested transactions and savepoints, identifying the source of a rollback or an error might be difficult.
Tracking down the sequence of events in complex transactions to find the error point can be time-consuming without detailed logging and monitoring.
Potential for Data Contention in High-Volume Environments:
In high-transaction environments like large e-commerce sites, the need to access the same resources often leads to data contention. While transactions prevent conflicts, they may also require advanced tuning to balance between performance and consistency.
For example, if hundreds of users attempt to buy the same item simultaneously, managing contention becomes challenging and may slow down performance.
States of Transactions
In SQL, a transaction goes through several key states from start to completion, each representing a phase in the transaction's lifecycle. Here's a breakdown of these states:
1. Active State
The transaction begins in the active state as soon as the first SQL command (like BEGIN TRANSACTION
) is issued.
This state represents that the transaction is currently in progress and actively performing operations. All operations, such as updates, inserts, or deletes, occur in this state.
The transaction can continue to execute until a condition (like an error or explicit command) changes its state.
2. Partially Committed State
After the transaction issues a COMMIT
command, it enters the partially committed state.
Here, the transaction has finished executing all of its operations, but the database has not yet applied the changes permanently. This state serves as an intermediate checkpoint, where the system prepares to make the changes permanent.
If the system crashes at this point, recovery mechanisms might still reverse the transaction.
3. Committed State
When the transaction successfully completes and all changes are made permanent, it reaches the committed state.
In this state, all changes made by the transaction are now visible to other users and persisted in the database.
The transaction is now fully complete and cannot be undone, except through another separate transaction.
4. Aborted / Failed State
If any error or issue occurs during the active or partially committed state (e.g., a constraint violation, hardware failure, or system error), the transaction moves to the failed/aborted state.
In this state, the system undoes all the operations performed in the transaction, rolling back to the initial database state before the transaction began.
Once rolled back, the transaction ends in a permanently aborted state, ensuring that no partial or incorrect data persists.
5. Terminated State
Once a transaction has reached either the committed or failed/aborted state, it finally enters the terminated state.
This final state signifies that the transaction is completely finished, with no further actions possible.
All resources used by the transaction (locks, memory, etc.) are released, and the transaction concludes fully.
ACID Properties
The ACID properties in SQL transactions ensure that database transactions are processed reliably and guarantee the integrity of the database even in the event of errors, system failures, or unexpected events.
They’re essential for the following reasons:
Ensuring Data Integrity: By adhering to ACID, the database ensures that it can consistently hold valid data, even in cases of failure or concurrent access.
Handling Errors and Failures Gracefully: ACID allows the database to recover from errors in a way that prevents data corruption.
Supporting Concurrent Transactions: With isolation, the database ensures that concurrent transactions don’t interfere with each other, preventing conflicts and anomalies.
Providing Reliability: Durability guarantees that committed transactions are permanent, ensuring reliable data storage.
ACID is an acronym that stands for the following four properties:
1. Atomicity
Definition:
Atomicity means that a transaction is treated as a single, indivisible unit.
The transaction either completes entirely (i.e., all operations within the transaction are applied), or it has no effect at all (i.e., none of the operations are applied).
In other words, even if a transaction contains multiple steps, they are all completed successfully or none are.
Purpose:
The purpose of atomicity is to ensure that partial changes to the database do not persist.
If any operation within a transaction fails, the database will be rolled back to its state before the transaction began, leaving no incomplete or inconsistent data.
Example:
Consider a bank transfer where $100 is deducted from Account A and added to Account B.
If one of the operations fails, neither the deduction from Account A nor the addition to Account B will be saved, preventing any inconsistencies.
2. Consistency
Definition:
Consistency ensures that a transaction brings the database from one valid state to another.
A transaction must take the database from a consistent state to another consistent state, following all rules, constraints, and triggers.
This means that the database should not be left in an invalid state after the transaction.
Purpose:
- The purpose of consistency is to maintain the integrity of the database by ensuring that any changes made by a transaction respect all predefined constraints (such as primary keys, foreign keys, unique constraints, etc.), triggers, and other rules enforced by the database system.
Example:
If a transaction adds a row to a table that enforces a
NOT NULL
constraint on a column, the transaction will not be allowed to proceed unless the required data is present.If the transaction violates the constraint, it will be rolled back, ensuring consistency.
3. Isolation
Definition:
Isolation ensures that the operations of one transaction are isolated from the operations of other concurrent transactions.
Even if multiple transactions are happening simultaneously, each transaction will be unaware of the others, preventing data conflicts and anomalies.
Purpose:
- The purpose of isolation is to avoid "dirty reads" (where a transaction reads uncommitted data from another transaction), "non-repeatable reads" (where a value read by a transaction changes during its execution), and "phantom reads" (where new rows are added or removed by another transaction during the execution of a query).
Example:
- If two users are transferring money from their accounts at the same time, isolation ensures that one transaction is completed before the other starts, preventing both from interacting with the same account balance at the same time.
Isolation Levels: The isolation property can be controlled using different isolation levels:
Read Uncommitted: Transactions may read uncommitted changes made by other transactions.
* Read Committed: Transactions can only read committed data; they cannot read data that is still being modified by other transactions.
* Repeatable Read: Ensures that if a transaction reads data, it can read the same data again without it being modified by another transaction.
* Serializable: The highest level of isolation, where transactions are executed in such a way that the results are equivalent to running them serially, one after the other.
[ May be a separate Blog on this topic alone ]
4. Durability
Definition:
- Durability ensures that once a transaction has been committed, its changes are permanent, even in the event of a system crash, power failure, or hardware malfunction.
Purpose:
- The purpose of durability is to guarantee that once a transaction is committed, the changes are securely stored in the database and will not be lost, regardless of any failures that occur afterward.
Example:
- If a transaction successfully completes and the system crashes before it can be saved to disk, the changes will still persist upon system recovery, and the transaction will not be lost.
Practical Guide
This is enough for you to learn about SQL and fire up your PostgreSQL database or MySQL database to get you hands on experience on to better understand them.
I prefer PostgreSQL Database, because, why not? It’s the most advanced open-source database that’s also complies with the standards SQL:2003 and beyond.
Step 1: Open You SQL Editor
Open pgAdmin4 for PostgreSQL Database and MySQL Workbench for MySQL. [ DBeaver works for both, if you prefer that as well. ]
Step 2: Create Database
You’ll have to create the using the GUI in the PostgreSQL.
But if you’re fan of queries, in the MySQL Workbench, you could use the following queries to get started.
-- Creating Database
CREATE DATABASE db_study_transactions;
-- Switches the current database context to db_sample, meaning that any subsequent SQL queries will be executed within the context of the db_sample database.
USE db_study_transactions;
Step 3: Create Tables and Populate with Sample Data
-- Create Table tbl_bank_accounts
CREATE TABLE tbl_bank_accounts(
account_id integer PRIMARY KEY,
account_holder character varying,
balance_amount numeric NOT NULL
);
-- Will show Nothing :(
SELECT * FROM tbl_bank_accounts ORDER BY account_id;
-- Add 2 records in tbl_bank_accounts
INSERT INTO tbl_bank_accounts
VALUES (1, 'Lightning McQueen', 10000),
(2, 'Jackson Storm', 5000);
-- Will show 2 perviously inserted 2 records :)
SELECT * FROM tbl_bank_accounts ORDER BY account_id;
file_name: setup.sql
Creates a table
tbl_bank_accounts
to and enforces necessary contraints that will help us understanding the SQL transactions much better.Insert 2 records as sample data.
Step 4: Understand how a Basic and Successful Transaction Works
SELECT * FROM tbl_bank_accounts ORDER BY account_id;
-- [BASIC] TRANSFER 100 FROM MCQUEEN TO JACKSON
-- [SUCCESSFUL TRANSACTION]
BEGIN TRANSACTION;
-- Deduct from McQueen
UPDATE tbl_bank_accounts
SET balance_amount = balance_amount - 100
WHERE account_id = 1;
-- Credit to Storm
UPDATE tbl_bank_accounts
SET balance_amount = balance_amount + 100
WHERE account_id = 2;
COMMIT;
SELECT * FROM tbl_bank_accounts ORDER BY account_id;
ROLLBACK;
file_name: basic_transaction_1.sql
Demonstrates a Successful Transaction.
The following is the how the Transaction works in Action
Transaction Begins
Updates balance_amount of Lightning McQueen by deducting 100.
Updates balance_amount of Jackson Storm by deducting 100.
Commits Transaction.
COMMIT;
statements marks the successful end of the transaction and hence, the Transactions goes from Partially Committed to Committed to Terminated state.
Step 5: Understand how a Basic and Failed Transaction is Rolled Back Entirely
SELECT * FROM tbl_bank_accounts ORDER BY account_id;
-- [BASIC] TRANSFER 100 FROM MCQUEEN TO HUDSON
-- [FAILED TRANSACTION]
-- BY INTRODUCING "PRIMARY KEY CONSTRAINT VIOLATION"
BEGIN TRANSACTION;
-- Deduct from McQueen
UPDATE tbl_bank_accounts
SET balance_amount = balance_amount - 100
WHERE account_id = 1;
-- Introducing "PRIMARY KEY CONSTRAINT VIOLATION"
INSERT INTO tbl_bank_accounts
VALUES (2, 'Doc Hudson', 3300);
ROLLBACK;
-- Credit to Hudson
UPDATE tbl_bank_accounts
SET balance_amount = balance_amount + 100
WHERE account_id = 3;
COMMIT;
SELECT * FROM tbl_bank_accounts ORDER BY account_id;
file_name: basic_transaction_2.sql
PART 1: Transaction that introduced error causing intentional termination.
The following is the how the Transaction works in Action
Transaction Begins
Updates balance_amount of Lightning McQueen by deducting 100.
Inserts a new record with the account_id = 2;
This although the
account_holder
andbalance_amount
are different, it invokes an error that goes along like "PRIMARY KEY CONSTRAINT VIOLATION".It means that a record cannot me inserted into the table, if there already exists an record with the specified Primary Key, i.e.,
account_id
.
After encountering this error, the Transaction goes from Active to Aborted state, but it is still not Terminated.
You’ll have to Terminate it, manually. Then it goes from Aborted to Terminated state.
-- [BASIC] TRANSFER 100 FROM MCQUEEN TO HUDSON
-- [SUCCESSFUL TRANSACTION]
BEGIN TRANSACTION;
-- Deduct from McQueen
UPDATE tbl_bank_accounts
SET balance_amount = balance_amount - 100
WHERE account_id = 1;
-- Correcting the "PRIMARY KEY CONSTRAINT VIOLATION"
INSERT INTO tbl_bank_accounts
VALUES (3, 'Doc Hudson', 3300);
-- Credit to Hudson
UPDATE tbl_bank_accounts
SET balance_amount = balance_amount + 100
WHERE account_id = 3;
COMMIT;
SELECT * FROM tbl_bank_accounts ORDER BY account_id;
file_name: basic_transaction_2.sql
PART 2: Correcting the above Error.
When inserting the record for 'Doc Hudson', replace the account_id
from 2 to 3.
The transaction would then execute successfully as follows:
Transaction Begins
Updates
balance_amount
of Lightning McQueen by deducting 100.Inserts a new record with the
account_id = 3
,account_holder = ‘Doc Hudson’
andbalance_amount = 3300
;Updates
balance_amount
of Lightning Hudson by crediting 100.The Transaction then goes from Partially Committed to Committed to Terminated state, after the
COMMIT
has been executed.
Step 6: Understand how Rollback to specific Savepoints within a Transaction.
SELECT * FROM tbl_bank_accounts ORDER BY account_id;
-- [BASIC] TRANSFER 100 FROM MCQUEEN TO HUDSON
-- [FAILED TRANSACTION]
-- BY INTRODUCING "NOT-NULL KEY CONSTRAINT VIOLATION"
BEGIN TRANSACTION;
-- create savepoint
SAVEPOINT initial_savepoint;
-- Deduct from McQueen
UPDATE tbl_bank_accounts
SET balance_amount = balance_amount - 100
WHERE account_id = 1;
-- Credit to Storm
UPDATE tbl_bank_accounts
SET balance_amount = balance_amount + 100
WHERE account_id = 2;
-- create savepoint
SAVEPOINT credited_to_jackson_storm;
-- Deduct from McQueen
UPDATE tbl_bank_accounts
SET balance_amount = balance_amount - 100
WHERE account_id = 1;
-- Credit to Doc Hudson
UPDATE tbl_bank_accounts
SET balance_amount = balance_amount + 100
WHERE account_id = 3;
-- Create Savepoint
SAVEPOINT credited_to_doc_hudson;
-- Deduct from McQueen
UPDATE tbl_bank_accounts
SET balance_amount = balance_amount - 100
WHERE account_id = 1;
-- Introducing "NOT-NULL KEY CONSTRAINT VILATION"
INSERT INTO tbl_bank_accounts
VALUES (4, 'Franchesco Bernoulli', NULL);
ROLLBACK TO SAVEPOINT credited_to_doc_hudson;
-- Credit to Bernoulli
UPDATE tbl_bank_accounts
SET balance_amount = balance_amount + 100
WHERE account_id = 4;
-- Create Savepoint
SAVEPOINT credited_to_franchesco_bernoulli;
COMMIT;
SELECT * FROM tbl_bank_accounts ORDER BY account_id;
file_name: intermediate_transaction_1.sql
PART 1: Transaction that introduced error causing intentional termination.
In the above we’re trying to multiple money transfers into 1 single transaction. The above can be summarised as follows:
McQueen sends 100 to Jackson
- create a savepoint
credited_to_jackson_storm
- create a savepoint
McQueen sends 100 to Hudson
- create a savepoint
credited_to_doc_hudson
- create a savepoint
Add the record for Bernoulli with balance_amount = NULL.
This will invoke an error that goes along like "NOT-NULL CONSTRAINT VIOLATION".
It means that a column
balance_account
that has NOT-NULL CONSTRAINT enforced does not let the query to get executed. 0 which is a perfectly valid value forbalance_account
is acceptable, but NOT the NULL value.
Again transaction goes from Active to Aborted state. But it’s still not Terminated.
This instead of rolling back the entire transaction [ reverting the 2 successful bank transfers ], we’ll rollback to the credited_to_doc_hudson savepoint.
By doing so,
the 100 deducted from McQueen during the attempt of bank transfer to Bernoulli are recovered.
but the effect of 2 successful transactions [ to Jackson and Hudson ] are preserved.
BEGIN TRANSACTION;
-- create savepoint
SAVEPOINT initial_savepoint;
-- Deduct from McQueen
UPDATE tbl_bank_accounts
SET balance_amount = balance_amount - 100
WHERE account_id = 1;
-- Credit to Storm
UPDATE tbl_bank_accounts
SET balance_amount = balance_amount + 100
WHERE account_id = 2;
-- create savepoint
SAVEPOINT credited_to_jackson_storm;
-- Deduct from McQueen
UPDATE tbl_bank_accounts
SET balance_amount = balance_amount - 100
WHERE account_id = 1;
-- Credit to Doc Hudson
UPDATE tbl_bank_accounts
SET balance_amount = balance_amount + 100
WHERE account_id = 3;
-- Create Savepoint
SAVEPOINT credited_to_doc_hudson;
-- Deduct from McQueen
UPDATE tbl_bank_accounts
SET balance_amount = balance_amount - 100
WHERE account_id = 1;
-- Correcting the "NOT-NULL KEY CONSTRAINT VILATION"
INSERT INTO tbl_bank_accounts
VALUES (4, 'Franchesco Bernoulli', 1100);
ROLLBACK TO SAVEPOINT credited_to_doc_hudson;
-- Credit to Bernoulli
UPDATE tbl_bank_accounts
SET balance_amount = balance_amount + 100
WHERE account_id = 4;
-- Create Savepoint
SAVEPOINT credited_to_franchesco_bernoulli;
COMMIT;
SELECT * FROM tbl_bank_accounts ORDER BY account_id;
file_name: intermediate_transaction_1.sql
PART 2: Correcting the above Error.
When inserting the record for 'Franchesco Bernoulli', replace the account_id
from NULL to any NON-NULL value, i.e., 1100 in our case.
The transaction would then execute successfully as follows:
Transaction Begins
Updates
balance_amount
of Lightning McQueen by deducting 100.Updates
balance_amount
of Jackson Storm by Crediting 100.Creates Savepoint,
credited_to_jackson_storm
Updates
balance_amount
of Lightning McQueen by deducting 100.Updates
balance_amount
of Doc Hudson Storm by Crediting 100.Creates Savepoint,
credited_to_doc_hudson
Updates
balance_amount
of Lightning McQueen by deducting 100.Inserts a new record with the
account_id = 3
,account_holder = ‘Franchesco Bernoulli’
andbalance_amount = 1100
;Updates
balance_amount
of Franchesco Bernoulli by crediting 100.The Transaction then goes from Partially Committed to Committed to Terminated state, after the
COMMIT
has been executed.
Time to Wrap Up!
If you’ve been through this article till the end, hats off to you, Sergeant! Writing this article was the ultimate test of my patience and yours as well.
While we’re simply focused on understanding these advanced concepts theoretically OR simply practicing DML queries for interviews, we’d often forget that SQL is more than that.
This was my attempt to the simplify the complex SQL Transaction that were not un-understandable despite mugging up before exams, until we actually had implement them in our projects. Not anymore!
I’ll surely add a few diagrams, wherever necessary, to facilitate more understandability. Any sort of feedback is welcome and I wish you all the best ahead!
Connect with me on the LinkedIn and Email on:
LinkedIn: www.linkedin.com/in/shrinivasv73
Email: shrinivasv73@gmail.com
See Ya!
Subscribe to my newsletter
Read articles from Shrinivas Vishnupurikar directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
Shrinivas Vishnupurikar
Shrinivas Vishnupurikar
👋🏻 Hello! My name is Shrinivas and with a strong foundation in Data Analytics and Machine Learning, I'm excited to learn from industry experts and contribute my skills. 🧑🏻🏫 As an Instructor, I've taught Python Programming Language with Project based learning approach to 140+ students thus fostering AI, Data Analytics, and ML interests. I am well-versed in Data Visualization, possess excellent communication skills, and excel in leadership and team building. 🤝 Whether it's presenting insights to technical team or non-technical stakeholders or collaborating with cross-functional teams or teaching and learning from seniors or peers, I'm committed to fostering clear and impactful communication. 🌟 If you share a passion for Data Analytics, Business Intelligence, Machine Learning, Prompt Engineering, Large Language Models, or simply want to connect, I'd love to hear from you. Get In Touch With Me :- Phone No: +91 8080687809 Email: shrinivasv73@gmail.com