What is a Transaction?

A transaction is a group of operations treated as a single unit of work: either every operation succeeds, or none of them takes effect. If any one operation fails, the whole set is rolled back, leaving the system in its previous stable state. Transactions are characterized by the four ACID properties: Atomicity, Consistency, Isolation, and Durability.

1. Atomicity

Scenario: A customer places an order.
Actions:
- The Order Service deducts the product quantity from the Inventory Service.
- The Payment Service processes the payment.
Atomicity Ensures: If the payment fails (e.g., due to a network issue) but the inventory was already deducted, the entire transaction is rolled back — meaning no inventory is deducted and no payment is processed. All steps succeed or none do.
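
The all-or-nothing behavior can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, assuming inventory and payments live in one database; the table names, the `place_order` helper, and the `payment_ok` flag that simulates a payment failure are all hypothetical.

```python
import sqlite3

def make_db():
    # In-memory database with hypothetical inventory and payments tables.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE inventory (product TEXT PRIMARY KEY, stock INTEGER NOT NULL)")
    conn.execute("CREATE TABLE payments (order_id TEXT PRIMARY KEY, amount REAL NOT NULL)")
    conn.execute("INSERT INTO inventory VALUES ('widget', 5)")
    conn.commit()
    return conn

def place_order(conn, order_id, product, qty, amount, payment_ok=True):
    """Deduct stock and record the payment as one atomic unit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE inventory SET stock = stock - ? WHERE product = ?",
                (qty, product),
            )
            if not payment_ok:
                raise RuntimeError("payment failed")  # simulated failure
            conn.execute("INSERT INTO payments VALUES (?, ?)", (order_id, amount))
        return True
    except RuntimeError:
        return False

def stock(conn, product):
    row = conn.execute("SELECT stock FROM inventory WHERE product = ?", (product,)).fetchone()
    return row[0]
```

When `place_order` is called with `payment_ok=False`, the stock deduction that already ran inside the transaction is undone, so the stock stays at 5.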

2. Consistency

Scenario: Maintaining accurate inventory levels.
Actions:
- The Order Service deducts the quantity from inventory.
Consistency Ensures: The system transitions from one valid state to another. If the deduction results in an invalid state (e.g., negative stock), the transaction is rolled back to maintain correct and valid data.
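
One way to enforce such a rule, assuming a relational database, is to declare it as a constraint so the database itself rejects invalid states. The sketch below uses `sqlite3` with a `CHECK (stock >= 0)` constraint; the table layout and the `deduct` helper are illustrative assumptions.

```python
import sqlite3

def make_inventory(initial_stock):
    conn = sqlite3.connect(":memory:")
    # The CHECK constraint encodes the rule "stock can never be negative".
    conn.execute(
        "CREATE TABLE inventory ("
        "product TEXT PRIMARY KEY, "
        "stock INTEGER NOT NULL CHECK (stock >= 0))"
    )
    conn.execute("INSERT INTO inventory VALUES ('widget', ?)", (initial_stock,))
    conn.commit()
    return conn

def deduct(conn, product, qty):
    """Return True if the deduction was applied, False if it was rejected."""
    try:
        with conn:  # rolls back automatically when the constraint is violated
            conn.execute(
                "UPDATE inventory SET stock = stock - ? WHERE product = ?",
                (qty, product),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

With an initial stock of 1, `deduct(conn, "widget", 2)` is rejected and the stock stays at 1.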

3. Isolation

Scenario: Multiple customers place orders at the same time.
Actions:
- Customer A and Customer B both try to buy the same product.
Isolation Ensures: Each transaction is processed as if it’s the only one. Customer A doesn’t see any partial updates made by Customer B and vice versa. This prevents data anomalies like double deduction or over-selling.

`Real-life Example:`

Imagine two customers (A and B) are buying the last item of a product at the same time.

Customer A's transaction: Checks inventory → sees 1 item → proceeds to buy.
Customer B's transaction: Also checks inventory → also sees 1 item → proceeds to buy.

➡️ Without Isolation, both may end up buying it. This leads to inconsistent or incorrect data (overselling).

➡️ With Isolation, only one customer will complete the purchase, and the other will see that the product is out of stock.
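
A common way to get this outcome in practice is to fold the "check" and the "deduct" into one atomic statement, so no other transaction can slip in between them. This is a sketch of that specific technique (a conditional update), not of isolation levels in general; the schema and the `buy_one` helper are hypothetical.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE inventory (product TEXT PRIMARY KEY, stock INTEGER NOT NULL)")
    conn.execute("INSERT INTO inventory VALUES ('widget', 1)")  # last item
    conn.commit()
    return conn

def buy_one(conn, product):
    """Atomically deduct one unit; succeeds only if stock is still available."""
    with conn:
        cur = conn.execute(
            "UPDATE inventory SET stock = stock - 1 "
            "WHERE product = ? AND stock >= 1",
            (product,),
        )
        # rowcount is 1 if the row matched (item bought), 0 if out of stock.
        return cur.rowcount == 1
```

Customer A's call returns `True`; Customer B's call, made afterwards, returns `False` because the `WHERE` clause no longer matches.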

4. Durability

Scenario: Successful completion of inventory deduction and payment.
Actions:
- Inventory is updated.
- Payment is confirmed.
Durability Ensures: Once the transaction is committed, the changes are permanently saved, even in the case of a system crash or power failure. The data will be recoverable and accurate.
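
As a rough illustration, the sketch below commits a change to a file-backed SQLite database, closes the connection (standing in for the process going away), and reads the value back from a fresh connection. It does not simulate a real crash or power failure; the file path and the `commit_then_reopen` helper are assumptions for the example.

```python
import os
import sqlite3
import tempfile

def commit_then_reopen(db_path):
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE IF NOT EXISTS inventory (product TEXT PRIMARY KEY, stock INTEGER)")
    conn.execute("INSERT OR REPLACE INTO inventory VALUES ('widget', 4)")
    conn.commit()   # after this point the change is on disk
    conn.close()    # stand-in for the process ending

    reopened = sqlite3.connect(db_path)
    try:
        return reopened.execute(
            "SELECT stock FROM inventory WHERE product = 'widget'"
        ).fetchone()[0]
    finally:
        reopened.close()
```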

Transactions in Monolith vs Microservices

Monolithic Architecture

A monolithic application is a single, unified application that connects to one centralized database.
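Because all data lives in that one database, every operation can run inside a single local transaction and rely on the database's ACID guarantees (Atomicity, Consistency, Isolation, Durability).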
The transaction boundary typically starts inside the service layer of the application.
If any part of the transaction fails (e.g., inventory deduction or payment fails), the entire transaction is rolled back, keeping the system in a consistent state.

Example – In a Monolith:

User places an order:

Product is added to cart
Order is created
Inventory is updated
Payment is processed

All these actions:

Are handled in a single service method
Use the same database
Are committed/rolled back as one unit of work

✅ Easy to maintain transactional integrity using standard relational database mechanisms.

Microservices Architecture

In Microservices, each service is responsible for a specific business function (following SRP – Single Responsibility Principle).
Each service has its own separate database — meaning no shared database across services.
Direct access to another service's database is strictly avoided.

Example – In Microservices:

User places an order:

Order Service: Creates the order
Inventory Service: Deducts product stock
Payment Service: Handles payment

Each of these:

Runs in separate services
Uses separate databases
Communicates via REST APIs or messaging systems (like Kafka)

❌ Problem: You can’t wrap all these operations in a single transaction. If payment fails after inventory is deducted, you cannot roll back the inventory automatically.
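
The problem can be reproduced with two toy in-memory services. This is a simplified sketch: the class names, the `fail` flag, and `place_order_naive` are hypothetical, and real services would talk over REST or messaging rather than direct method calls.

```python
class InventoryService:
    """Owns its own data store (here, a dict)."""
    def __init__(self, stock):
        self.stock = dict(stock)

    def deduct(self, product, qty):
        if self.stock.get(product, 0) < qty:
            raise ValueError("insufficient stock")
        self.stock[product] -= qty  # committed locally, right away

class PaymentService:
    """Owns a separate data store (here, a list of charges)."""
    def __init__(self, fail=False):
        self.fail = fail
        self.charges = []

    def charge(self, order_id, amount):
        if self.fail:
            raise RuntimeError("payment declined")
        self.charges.append((order_id, amount))

def place_order_naive(inventory, payments, order_id, product, qty, amount):
    inventory.deduct(product, qty)      # step 1 commits in the inventory store
    payments.charge(order_id, amount)   # if this raises, step 1 is NOT undone
```

If the payment is declined, the stock has already been reduced and no charge exists: the two services now disagree about the order.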

Patterns for Distributed Transaction Management

1. Synchronous Patterns

- Two-Phase Commit (2PC)
- Three-Phase Commit (3PC)

2. Asynchronous Patterns

- Orchestration-Based Saga Pattern
- Choreography-Based Saga Pattern

Two-Phase Commit (2PC):

Two-Phase Commit (2PC) is a protocol used to manage distributed transactions — where multiple microservices (each with its own database) need to either all commit or all rollback as one single unit.

It works in two stages:

Prepare Phase
Commit/Abort Phase

There are two roles:

- The Coordinator manages the transaction. In this example, it is the Order Service.
- The other services are called Participants. In this example, they are the Payment Service and the Inventory Service.

`Step-by-Step Flow`

1. Transaction Initiation (Coordinator)

A user places an order.
The Order Service starts the transaction and acts as the Coordinator.

2. Prepare Phase (Voting)

The Order Service sends a "prepare" message to:
- Payment Service
- Inventory Service
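Each participant checks:
- Can I do my part? (e.g., is stock available? is payment possible?)
- If yes, it reserves what it needs (for example, by locking the affected rows and durably recording the pending change) without making the change final.
They respond:
- "Yes" → ready to commit, and able to do so even after a restart
- "No" → can’t commit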

3. Decision Phase (Commit or Abort)

If all participants say "Yes":
- Order Service sends "commit" message to everyone.
If any participant says "No":
- Order Service sends "abort" message to cancel everything.

4. Commit Phase

If commit:
- Order Service saves the order
- Payment Service charges the user
- Inventory Service reduces stock
If abort:
- All services rollback or skip their changes

5. Acknowledgment Phase

Each service sends a final "done/ack" message back to the Order Service after completing its part.

6. Completion

If all acknowledgments are received:
- The transaction is marked successful
- The client is notified ✅
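If any service fails or doesn’t respond after the commit decision:
- The decision is already final, so the coordinator keeps resending it until that service acknowledges. Until then, the client may see a delay or a failure ❌. (Compensating actions such as refunds or restocking belong to the Saga pattern, not to 2PC.)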
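
The flow above can be sketched in a few lines. This is an in-memory illustration under simplifying assumptions (no network, no crashes, no durable log); the `Participant` class, its `can_commit` flag, and the state names are hypothetical.

```python
from enum import Enum

class Vote(Enum):
    YES = "yes"
    NO = "no"

class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit  # simulates "is stock available?" etc.
        self.state = "init"

    def prepare(self):
        # Phase 1: reserve resources and vote.
        if self.can_commit:
            self.state = "prepared"
            return Vote.YES
        self.state = "aborted"
        return Vote.NO

    def commit(self):
        self.state = "committed"
        return "ack"

    def abort(self):
        self.state = "aborted"
        return "ack"

def two_phase_commit(participants):
    """Coordinator logic: collect votes, then broadcast one decision."""
    votes = [p.prepare() for p in participants]            # prepare phase
    if all(v is Vote.YES for v in votes):
        decision = "commit"
        acks = [p.commit() for p in participants]          # commit phase
    else:
        decision = "abort"
        acks = [p.abort() for p in participants]
    # Acknowledgment phase: every participant must confirm.
    assert all(a == "ack" for a in acks)
    return decision
```

A single "No" vote from any participant is enough to make the coordinator abort everyone, including participants that voted "Yes".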

Drawbacks of Two-Phase Commit (2PC):

1. Too Much Coordination

In 2PC, the coordinator (like Order Service) has to exchange several messages with every participant (like Payment, Inventory): prepare, vote, decision, and acknowledgment.
These extra round trips add latency to every transaction and limit throughput, especially when there are many users.

2. One Service Controls Everything

The coordinator alone decides the outcome, so it is a single point of failure.
If the coordinator suddenly crashes after the other services have voted "Yes", they are left holding prepared changes and won’t know whether to finish or cancel.
Until the coordinator recovers, that data stays locked in an uncertain state.

3. All Services Must Wait

While a transaction is in progress, every involved service keeps the affected data locked, so other work that needs the same data has to wait.
The locks are held until the final decision (commit or abort) is made.
If one service is slow or has a problem, then everyone waits, which reduces performance.

4. Problem if Coordinator Fails at the End

If the coordinator tells some services to commit, but crashes before others receive the message, those other services won’t know what to do.
They may wait a long time.
Some systems let them abort on their own after a timeout, but this is not safe: another service may already have committed.

Because of these issues, a better version called Three-Phase Commit (3PC) was designed to avoid some of these problems, especially the blocking and confusion when the coordinator fails.

Common Failure Scenarios in 2PC:

| S No | Failure Scenario | What Happens |
|------|------------------|--------------|
| 1 | Participant fails during Prepare Phase | Crashes before its vote reaches the coordinator → Coordinator doesn’t get a reply → Coordinator times out and aborts the transaction |
| 2 | Participant fails during Commit Phase | Commits locally, but crashes before sending “ack” → Coordinator cannot tell whether the commit happened and resends the decision |
| 3 | Coordinator fails after sending Commit message | Some services get the message, some don’t → Data is inconsistent until the remaining services learn the decision |
| 4 | Network issues | Messages are delayed or lost → Coordinator or participant times out → May trigger abort |
| 5 | Mixed votes from participants | Some say "Yes", some say "No" → Coordinator aborts the transaction |
| 6 | Coordinator fails after receiving all "Yes" votes | Coordinator crashes before sending "commit" → Participants wait without knowing what to do |
| 7 | Indefinite blocking | Services wait forever for commit/abort/ack → Stuck in unknown state |
| 8 | Timeouts or unresponsiveness | Coordinator or participant doesn't reply in time → Others abort using timeout logic, which is only safe before they have voted "Yes" |
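
The timeout behavior in these scenarios depends on what the participant has already promised. The function below is a small sketch of that rule; the state names (`init`, `prepared`, `committed`, `aborted`) and the returned action strings are hypothetical labels, not part of any standard API.

```python
def on_coordinator_timeout(participant_state):
    """What a 2PC participant may do when the coordinator goes silent."""
    if participant_state == "init":
        # Has not voted yet: nobody can have committed, so aborting is safe.
        return "abort"
    if participant_state == "prepared":
        # Voted "Yes": the coordinator may have decided either way,
        # so the participant must keep its locks and wait (blocking).
        return "block"
    # Already committed or aborted: the outcome is known, nothing to do.
    return "none"
```

The `"block"` branch is the reason 2PC is called a blocking protocol.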

When Should You Use Two-Phase Commit (2PC)?

1. When You Need Strong Consistency

If your system involves many services and it's very important that all data stays accurate and in sync, 2PC is a good option.
Example: updating multiple databases together without allowing mismatch.

2. For Critical Business Operations

If the transaction is very important and failure could cause major problems (like lost orders or wrong records), 2PC helps ensure either everything succeeds or nothing does.

3. For Financial Transactions

In banking, payments, or money transfers, data must be 100% correct.
2PC helps make sure that no money is deducted unless everything else goes right.

Three-Phase Commit (3PC):

Why was 3PC introduced?

In 2PC, once participants send a "Yes" vote, they wait for the coordinator's final decision (commit or abort). If the coordinator crashes at this point, the participants remain blocked — they don’t know whether to proceed or rollback. This makes 2PC a blocking protocol.
To fix this, 3PC (Three-Phase Commit) was introduced. It adds an extra step called the Pre-commit phase between the prepare and commit phases. This phase helps avoid indefinite blocking.
If the coordinator crashes before the pre-commit is sent, no participant can have committed yet, so participants that time out can safely abort. If the coordinator crashes after sending pre-commit but before the final commit, participants that received it know that everyone voted "Yes" and can commit on their own after a timeout.
So, 3PC improves reliability by allowing participants to make safe decisions when the coordinator fails, reducing the chance of being stuck forever.

Three-Phase Commit (3PC) is an extension of Two-Phase Commit (2PC) that adds an extra step to avoid blocking issues. In 3PC, the commit phase is split into two steps, making the process more reliable in case of failures.

The Two Steps That Replace the Commit Phase in 3PC (after voting):

Prepare to Commit (Pre-Commit Phase):
After all participants say “Yes” in the voting phase, the coordinator sends a message asking them to prepare to commit.
At this point, participants record that a commit is coming, keep their resources locked, and acknowledge, but don’t commit yet.
Do Commit:
If the coordinator receives confirmation from all participants that they are prepared, it then tells them to commit the transaction.
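
The three phases, plus the timeout rule that makes 3PC non-blocking, can be sketched as follows. This is an in-memory illustration that ignores networking and crash recovery; the `Participant3PC` class, its method names, and its state labels are assumptions made for the example.

```python
class Participant3PC:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def can_commit_request(self):      # phase 1: voting
        if self.can_commit:
            self.state = "ready"
            return True
        self.state = "aborted"
        return False

    def pre_commit(self):              # phase 2: prepare to commit
        self.state = "pre-committed"
        return True                    # acknowledgment

    def do_commit(self):               # phase 3: do commit
        self.state = "committed"

    def abort(self):
        self.state = "aborted"

    def on_timeout(self):
        """Decide alone when the coordinator goes silent."""
        if self.state == "pre-committed":
            self.do_commit()           # everyone voted yes: commit is safe
        elif self.state in ("init", "ready"):
            self.abort()               # no pre-commit seen: abort

def three_phase_commit(participants):
    # Lists (not generators) so every participant is contacted in each phase.
    if not all([p.can_commit_request() for p in participants]):
        for p in participants:
            p.abort()
        return "abort"
    if not all([p.pre_commit() for p in participants]):
        for p in participants:
            p.abort()
        return "abort"
    for p in participants:
        p.do_commit()
    return "commit"
```

Compared with 2PC, a participant that has voted "Yes" is no longer stuck: its `on_timeout` rule gives it a defined action in every state.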

What happens if the coordinator fails during the commit phase?

A secondary coordinator takes over the process.
It asks all participants about their current status.
If it finds that some participants are already in the pre-commit or commit state, it knows that every participant voted “Yes” and that the original coordinator had decided to commit, so it tells all participants to commit.
But if no one received the “prepare to commit” message, the new coordinator assumes that the original coordinator failed before making any decision. In that case, it is safe to abort the transaction, and it tells everyone to roll back.
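
The takeover rule can be written as a small decision function. This is a sketch of the logic described above, assuming the new coordinator can reach every participant; the state strings are hypothetical labels.

```python
def recovery_decision(participant_states):
    """Decide the outcome from the states reported to the new coordinator."""
    states = list(participant_states)
    if any(s == "aborted" for s in states):
        return "abort"   # someone already aborted: finish the abort
    if any(s in ("pre-committed", "committed") for s in states):
        return "commit"  # the old coordinator had decided to commit
    return "abort"       # nobody saw "prepare to commit": abort is safe
```

For example, `recovery_decision(["ready", "pre-committed"])` returns `"commit"`, while `recovery_decision(["ready", "ready"])` returns `"abort"`.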

Drawbacks of Three-Phase Commit (3PC)

Even though 3PC solves some problems of 2PC, it has its own issues.

1. Increased Complexity

3PC adds an extra phase called the Pre-commit phase.
This makes the protocol harder to build and understand.

2. 🐢 More Latency (Slower)

Because of the extra phase, transactions take more time.
All services must wait until the slowest service responds.
During this time, resources stay locked.

3. 🔒 Blocking During Recovery

3PC reduces blocking, but not fully.
In some cases, the system still waits for replies from all services during recovery.

4. 🧠 Still Centralized

Like 2PC, 3PC needs a central coordinator.
So it is not fully decentralized.

5. 📨 More Messages = More Network Load

3PC has more steps → more messages between services.
This increases communication overhead.

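6. ❌ Can't Handle Network Partitions Well

3PC assumes that failures are clean crashes that timeouts can detect reliably.
If the network splits, or a slow service is mistaken for a crashed one, different groups of services can time out in different states and reach different decisions (some commit, some abort).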

7. 🔄 Depends on Synchronous Communication

3PC uses synchronous communication (waits for replies).
This adds predictability, but also latency.
In modern systems, asynchronous methods are often preferred.

✅ What’s the Alternative?

👉 Use Saga Pattern (Async, Eventual Consistency)

Saga does not need distributed ACID transactions.
Instead, it accepts eventual consistency.
It uses compensating actions to undo work if needed.

🔁 When to Use Sync vs Async?

✅ Use Synchronous (like 2PC) for Financial Transactions

Example: Transfer money between accounts.
Requires strong consistency.
It’s okay to block if needed to ensure data is 100% correct.
2PC makes sure both debit and credit happen together or not at all.

✅ Use Asynchronous (like Saga) for E-commerce Workflows

Example: Place an order → Payment → Inventory → Shipping.
Services work independently.
If payment fails, the system can cancel the order or restock items.
Saga allows partial rollback using compensating transactions.
Good for loose coupling and flexibility.
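
An orchestration-style saga with compensating actions can be sketched as a loop over steps. This is a minimal in-memory illustration: `run_saga`, the step names, and the `demo` helper are hypothetical, and a real saga would persist its progress and call remote services.

```python
def run_saga(steps, log):
    """Run (name, action, compensation) steps; undo completed ones on failure."""
    completed = []
    for name, action, compensate in steps:
        try:
            action()
            log.append(f"done:{name}")
            completed.append((name, compensate))
        except Exception:
            log.append(f"failed:{name}")
            # Compensate in reverse order of completion.
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"compensated:{done_name}")
            return False
    return True

def demo(payment_ok):
    state = {"order": None, "stock": 5}
    log = []

    def create_order():
        state["order"] = "created"

    def cancel_order():
        state["order"] = "cancelled"

    def reserve_stock():
        state["stock"] -= 1

    def restock():
        state["stock"] += 1

    def charge_payment():
        if not payment_ok:
            raise RuntimeError("payment declined")  # simulated failure

    def refund_payment():
        pass  # nothing to refund in this toy example

    steps = [
        ("create_order", create_order, cancel_order),
        ("reserve_stock", reserve_stock, restock),
        ("charge_payment", charge_payment, refund_payment),
    ]
    ok = run_saga(steps, log)
    return ok, state, log
```

Unlike 2PC, each step commits immediately in its own service. When the payment fails, the saga restocks the item and cancels the order instead of rolling back a shared transaction.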

Transaction Management(2PC & 3PC commit) -- Microservices