One-Phase Commit Transaction Strategy
Introduction
A commonly adopted transaction strategy can be described as the best-efforts one-phase-commit (1PC) pattern. It's different from a global XA/2PC protocol where an external transaction manager ensures that all transaction properties are maintained across the involved transactional resources (database, queue, etc).
The basic idea behind 1PC is to delay each commit until as late as possible, so that all business processing failures are caught before the commit point. The only things that can still go wrong after that are infrastructure failures, which are rare.
It is a relaxation of ACID properties spanning multiple transactional resources. That also means there's a certain risk of system inconsistency in a worst-case scenario, which can be mitigated if the processing is idempotent. For this strategy to work as safely as possible, idempotency is key.
This concept is also described in full detail in Dr David Syer's article Distributed transactions in Spring, with and without XA from 2009.
Scenarios
To illustrate 1PC let's review a couple of examples.
Database and Message Broker
Consider a typical service activator scenario where there's a database and a JMS message queue involved. It includes a write to the database and an acknowledgement to the broker that a message was received. These two operations are independent: nothing makes them atomic as a pair.
This scenario also maps to Kafka, which uses consumer offsets and message retention rather than per-message acks (a JMS queue removes a message once it is acknowledged).
1. Start messaging transaction (broker delivers a message)
2. Receive message
3. Start database transaction
4. Update database
5. Commit database transaction
6. Commit messaging transaction (ack is sent to broker upon which the message is removed from destination)
The exact order of the first four steps is not important, as long as the message is received before the database is updated and each transaction starts before its corresponding resource is used.
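Dual-write problem (below):
@Component
public class RegistrationConsumer {
    // Assumed to be a Spring Data repository; injected through the constructor.
    private final RegistrationRepository registrationRepository;

    public RegistrationConsumer(RegistrationRepository registrationRepository) {
        this.registrationRepository = registrationRepository;
    }

    // The database commit and the broker ack are two separate, non-atomic operations.
    @JmsListener(destination = "${active-mq.topic}", containerFactory = "jmsListenerContainerFactory")
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void receiveMessage(RegistrationEvent event) {
        registrationRepository.save(toEntity(event));
    }
}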
The following order is therefore just as valid:
1. Start messaging transaction
2. Start database transaction
3. Receive message
4. Update database
5. Commit database transaction
6. Commit messaging transaction
The last two steps (5 and 6) must come last, and in that order. It's better to surface business processing violations (bad input, rule violations, constraint violations, etc.) before anything is made permanent. When database operations are flushed and committed before the message is acknowledged, there's less chance of the two systems going out of sync.
An out-of-sync condition arises when the database transaction commits but the message broker ack fails. The broker then redelivers the message, the same event is processed twice, and if the database writes are nonidempotent you end up with multiple side effects. The opposite case, where the commit fails but the ack succeeds, would lose the event altogether, which is why the database commit comes before the ack.
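The failure window between the database commit and the broker ack can be illustrated with a small simulation. This is only a sketch under stated assumptions: the queue, the "processed message" table, and the class name `BestEffortsConsumer` are in-memory stand-ins invented for illustration, not part of JMS or any database API.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// In-memory stand-ins for a broker queue and a database (both hypothetical).
class BestEffortsConsumer {
    final Deque<String> queue = new ArrayDeque<>();   // unacknowledged messages
    final Set<String> processedIds = new HashSet<>(); // "processed message" table
    int balance = 0;                                  // business state in the database
    boolean failNextAck = false;                      // simulates an infrastructure failure
    final boolean deduplicate;

    BestEffortsConsumer(boolean deduplicate) {
        this.deduplicate = deduplicate;
    }

    // One delivery attempt: receive, update and commit the database, then ack.
    void deliverOnce() {
        String messageId = queue.peek();              // receive without removing
        if (messageId == null) {
            return;
        }
        // Database transaction: the update is skipped for a known id when deduplication is on.
        if (!deduplicate || processedIds.add(messageId)) {
            balance += 100;
        }
        // The messaging transaction commits last; if the ack fails the message stays queued.
        if (failNextAck) {
            failNextAck = false;
            return;                                   // ack lost, the broker will redeliver
        }
        queue.poll();                                 // ack: message removed from the queue
    }

    // Runs the same failure scenario and returns the resulting balance.
    static int run(boolean deduplicate) {
        BestEffortsConsumer consumer = new BestEffortsConsumer(deduplicate);
        consumer.queue.add("msg-1");
        consumer.failNextAck = true;
        consumer.deliverOnce();                       // commit succeeds, ack fails
        consumer.deliverOnce();                       // redelivery of msg-1
        return consumer.balance;
    }
}
```

With `run(false)` the redelivered message is applied twice and the balance ends at 200; with `run(true)` the recorded message id turns the redelivery into a no-op and the balance stays at 100.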
In the case of object-relational mappers (ORMs), most database actions take place during the commit phase, because pending changes are held in the first-level cache until then. It's at that point that the JPA provider (such as Hibernate) performs update optimizations (collapsing) and determines what SQL statements to send to the database. It's also the phase where the database may raise data model constraint violations to preserve integrity.
Things that can go wrong in the messaging transaction are network and process failures with the broker, which are less likely to occur.
Database and Remote API Call
Consider a typical service boundary scenario where there's a database and a foreign API service involved. It includes a write to the database and an API call to the remote endpoint, which in turn creates some side effect. These two operations are independent: nothing makes them atomic as a pair.
1. Start database transaction
2. Update database
3. Send a POST request to the remote API
4. Commit database transaction
First off, it's not advisable to invoke remote calls from within a transaction context. Therefore, the minimum requirement is to reorder the steps so that the remote call comes first:
1. Send a POST request to the remote API
2. Start database transaction
3. Update database
4. Commit database transaction
Better, but still there's a potential issue here if the database transaction fails. In that case, when you retry the entire operation there will be another POST request sent to the endpoint. If that endpoint is nonidempotent you may end up with multiple side effects. Essentially this is the same problem as in the first example, called non-atomic dual-writes.
@Service
public class TransferService {
    // The remote call runs inside the transaction; a retry sends the POST again.
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void createTransfer(TransferEntity entity) {
        ResponseEntity<String> response = new RestTemplate().postForEntity(
                "https://api.bank.com", toRequestPayload(entity), String.class);
        if (!response.getStatusCode().is2xxSuccessful()) {
            throw new IllegalStateException("Disturbance!");
        }
    }
}
However, in this scenario, if the endpoint is idempotent (invoking many times is the same as invoking once) then it doesn't matter how many times you retry it. It will only have one single side effect.
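One common way to make such an endpoint idempotent is to have the caller send the same key on every retry. The sketch below assumes this approach; `TransferEndpoint`, `TransferClient`, and the key format are hypothetical names used only for illustration, and the database is reduced to a single field.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical remote endpoint that remembers the idempotency key of each request.
class TransferEndpoint {
    private final Map<String, String> receiptsByKey = new HashMap<>();
    int transfersExecuted = 0;

    // Executes the transfer once per key; a repeated key returns the stored receipt.
    String post(String idempotencyKey, long amount) {
        return receiptsByKey.computeIfAbsent(idempotencyKey, key -> {
            transfersExecuted++;
            return "receipt-" + transfersExecuted + "-for-" + amount;
        });
    }
}

// Caller that sends the POST first and then runs the local database transaction.
class TransferClient {
    private final TransferEndpoint endpoint;
    private int databaseFailuresToSimulate;
    String storedReceipt; // stands in for the row written by the database transaction

    TransferClient(TransferEndpoint endpoint, int databaseFailuresToSimulate) {
        this.endpoint = endpoint;
        this.databaseFailuresToSimulate = databaseFailuresToSimulate;
    }

    // One attempt: remote call outside the transaction, then the database write.
    void attempt(String idempotencyKey, long amount) {
        String receipt = endpoint.post(idempotencyKey, amount);
        if (databaseFailuresToSimulate > 0) {
            databaseFailuresToSimulate--;
            throw new IllegalStateException("database commit failed");
        }
        storedReceipt = receipt; // database commit point
    }

    // Retries the whole operation with the same key until it succeeds or attempts run out.
    void createTransfer(String idempotencyKey, long amount, int maxAttempts) {
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                attempt(idempotencyKey, amount);
                return;
            } catch (IllegalStateException e) {
                if (i == maxAttempts) {
                    throw e;
                }
            }
        }
    }
}
```

If the database commit fails twice, `createTransfer("transfer-42", 500L, 5)` sends three POST requests, yet the endpoint executes the transfer only once because every retry carries the same key.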
CockroachDB and XA
Currently, CockroachDB doesn't support the XA protocol, but there's a tracking issue for it. The good thing is that XA distributed transactions are not strictly needed to support the above scenarios. There are plenty of alternative options, such as:
Saga pattern:
- A decomposed version of 2PC where the involved services implement participation and compensation methods as part of an agreement protocol, using either an orchestrated or a choreographed approach.
- The practical use is between disparate services (not between databases and/or brokers).
- Quite complicated to implement and test, and it reduces understandability.
Outbox pattern:
- Domain events are written to the database as part of the local transaction.
- Domain events are published downstream after the commit point using CDC (change data capture).
- Avoids the non-atomic dual-write problem.
- The practical use is between disparate services.
Inbox pattern:
- Incoming messages are stored in the database, and CDC is then used to publish or self-subscribe to the messages.
- Offloads the message broker and adds retention.
- The practical use is between disparate services.
1PC with idempotency:
- As described in this article.
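To make the outbox option more concrete, here is a minimal in-memory sketch. It assumes a business table and an outbox table written in one local transaction, and a relay that stands in for a CDC feed; `OutboxDatabase`, `OutboxRelay`, and the event format are hypothetical and do not reflect any particular product's API.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory stand-in for a database with two tables (all names are hypothetical).
class OutboxDatabase {
    final List<String> orders = new ArrayList<>();
    final List<String> outbox = new ArrayList<>();

    // Local transaction: the order row and the event row are written together or not at all.
    void placeOrder(String orderId, boolean failBeforeCommit) {
        if (failBeforeCommit) {
            // Rollback: neither the order nor its event is written.
            throw new IllegalStateException("transaction rolled back");
        }
        orders.add(orderId);
        outbox.add("OrderPlaced:" + orderId);
    }
}

// Relay that runs after the commit point; it stands in for a CDC feed.
class OutboxRelay {
    private int nextPosition = 0; // position of the next unpublished outbox row
    final List<String> published = new ArrayList<>();

    // Publishes every outbox row that has not been published yet.
    void publishNew(OutboxDatabase database) {
        while (nextPosition < database.outbox.size()) {
            published.add(database.outbox.get(nextPosition));
            nextPosition++;
        }
    }
}
```

Because the event only exists if the local transaction committed, a rolled-back order never produces a downstream event, and the relay can be run repeatedly without publishing the same row twice in this simplified model.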
Conclusion
This article outlined a commonly adopted transaction strategy described as the best-efforts one-phase-commit (1PC) pattern. It offers a simple, low-effort alternative to XA, provided that operations are idempotent.