Ensuring Message Ordering in Kafka Producers

In distributed systems, message order can make or break correctness for use cases such as financial transactions, audit logs, and real-time event tracking. Kafka preserves order within well-defined limits, but there are trade-offs and nuances every developer should understand.

Kafka’s Message Ordering Model

Kafka guarantees message ordering within a partition. When a producer sends messages in a defined order to the same partition, the broker appends them to the partition log in that order, and consumers read them in that order. Kafka makes no ordering guarantee across partitions.

Why per-partition?
Because partitions are Kafka’s unit of parallelism and storage. Different partitions can be written and read in parallel, so scoping the ordering guarantee to a single partition lets Kafka keep order where it matters without giving up scalability.
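
To make the per-partition scope concrete, here is a minimal sketch of how records that share a key end up in the same partition log, in send order. It is only an illustration: Kafka's default partitioner hashes the serialized key with murmur2, while this sketch substitutes String.hashCode() to stay within the standard library. The class and method names (KeyedPartitioningSketch, partitionFor, route) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class KeyedPartitioningSketch {
    // Stand-in for Kafka's default partitioner. The real one hashes the
    // serialized key with murmur2; hashCode() is an assumption of this sketch.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Appends each {key, event} record to the log of the partition chosen for its key.
    static Map<Integer, List<String>> route(List<String[]> records, int numPartitions) {
        Map<Integer, List<String>> logs = new TreeMap<>();
        for (String[] record : records) {
            int partition = partitionFor(record[0], numPartitions);
            logs.computeIfAbsent(partition, p -> new ArrayList<>())
                .add(record[0] + ":" + record[1]);
        }
        return logs;
    }

    public static void main(String[] args) {
        List<String[]> records = List.of(
            new String[] {"account-1", "debit"},
            new String[] {"account-2", "credit"},
            new String[] {"account-1", "credit"});
        // Both account-1 events land in one partition, debit before credit.
        System.out.println(route(records, 3));
    }
}
```

Events for account-1 keep their relative order because they share a partition. Nothing is implied about how they interleave with account-2 events if those land elsewhere.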

When Does Order Matter?

If your use case requires reconstructing a timeline (e.g., bank debits/credits), processing out-of-order messages can lead to data corruption or inconsistencies. In such scenarios, preserving strict message order is non-negotiable.

The Catch: Retries & In-flight Requests

Kafka producers are typically configured for resilience, which includes retries in case of transient failures (like a temporarily unavailable broker). But this introduces a subtle risk:

⚠️ The Ordering Problem

If a producer without idempotence (enable.idempotence = false) has:

retries > 0
max.in.flight.requests.per.connection > 1

Then this can happen:

Producer sends Batch A and Batch B to the same partition.
Broker fails to write Batch A but successfully writes Batch B.
Producer retries Batch A and it succeeds.

🚨 Result: Batch B appears before Batch A in the partition.
➡️ Order is violated.
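
The scenario above can be reproduced with a small in-memory simulation. This is a simplified model of a non-idempotent producer, not Kafka's actual client code: every batch in a round is treated as sent before any acknowledgement arrives, and a failed batch is retried in the next round. All names (InFlightReorderingSketch, send, failFirstAttempt) are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class InFlightReorderingSketch {
    // Returns the order in which batches end up in the partition log.
    // Batches listed in failFirstAttempt fail once, then succeed on retry.
    static List<String> send(List<String> batches, Set<String> failFirstAttempt, int maxInFlight) {
        Deque<String> pending = new ArrayDeque<>(batches);
        Set<String> alreadyFailed = new HashSet<>();
        List<String> partitionLog = new ArrayList<>();

        while (!pending.isEmpty()) {
            // Send up to maxInFlight batches without waiting for acknowledgements.
            List<String> inFlight = new ArrayList<>();
            while (inFlight.size() < maxInFlight && !pending.isEmpty()) {
                inFlight.add(pending.pollFirst());
            }
            List<String> toRetry = new ArrayList<>();
            for (String batch : inFlight) {
                // add() returns true only the first time, so a batch fails at most once.
                boolean fails = failFirstAttempt.contains(batch) && alreadyFailed.add(batch);
                if (fails) {
                    toRetry.add(batch);
                } else {
                    partitionLog.add(batch);
                }
            }
            // Failed batches go back to the front of the queue for the next round.
            for (int i = toRetry.size() - 1; i >= 0; i--) {
                pending.addFirst(toRetry.get(i));
            }
        }
        return partitionLog;
    }

    public static void main(String[] args) {
        System.out.println(send(List.of("A", "B"), Set.of("A"), 2)); // prints [B, A]
        System.out.println(send(List.of("A", "B"), Set.of("A"), 1)); // prints [A, B]
    }
}
```

With two requests in flight, Batch B is written while Batch A is still waiting for its retry. With one in flight, Batch B cannot be sent until Batch A has succeeded.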

Ensuring Order: The Reliable Approach

To guarantee message order even with retries enabled:

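// Retry transient send failures up to 5 times.
props.put(ProducerConfig.RETRIES_CONFIG, 5);
// Allow only one unacknowledged request per connection, so a retry cannot be overtaken.
props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 1);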

This ensures that if a batch fails and is retried, no later batch is sent on that connection in the meantime, so order is preserved.

✅ Correctness over Throughput
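
For context, here is a sketch of how these two settings might sit in a complete set of producer properties. It uses the plain string config names so it compiles without the Kafka client on the classpath. The helper name orderedProducerProps, the broker address, and the choice of StringSerializer are assumptions made for illustration.

```java
import java.util.Properties;

class OrderedProducerConfigSketch {
    // Builds producer properties that favor ordering over throughput.
    // The serializer choice is an assumption; use whatever matches your data.
    static Properties orderedProducerProps(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Retry transient failures instead of dropping the batch.
        props.put("retries", "5");
        // One unacknowledged request at a time, so retries cannot be overtaken.
        props.put("max.in.flight.requests.per.connection", "1");
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" is a placeholder broker address.
        Properties props = orderedProducerProps("localhost:9092");
        System.out.println(props.getProperty("retries"));
        System.out.println(props.getProperty("max.in.flight.requests.per.connection"));
    }
}
```

The resulting Properties object could then be passed to a KafkaProducer constructor in an application that has the Kafka client library available.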

The Trade-off: Throughput Penalty

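Limiting max.in.flight.requests.per.connection to 1 means the producer must wait for each request to be acknowledged before sending the next one on that connection. This removes pipelining and can significantly reduce throughput. For high-throughput workloads where order is not crucial, this may not be desirable.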

🎯 Guideline: Only enforce strict ordering when business logic absolutely requires it. Otherwise, tune for throughput.
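
A rough way to reason about the penalty is a latency-bound estimate: if each request takes a fixed round trip to be acknowledged, at most maxInFlight requests can complete per round trip on one connection. The sketch below encodes only that back-of-the-envelope bound. It ignores batch size, network bandwidth, and broker limits, and the 10 ms round trip is a made-up figure.

```java
class InFlightThroughputSketch {
    // Upper bound on acknowledged requests per second over one connection,
    // assuming a constant round-trip time (a simplification).
    static double maxBatchesPerSecond(int maxInFlight, double roundTripMillis) {
        return maxInFlight * 1000.0 / roundTripMillis;
    }

    public static void main(String[] args) {
        // Hypothetical 10 ms round trip to the broker.
        System.out.println(maxBatchesPerSecond(1, 10.0)); // prints 100.0
        System.out.println(maxBatchesPerSecond(5, 10.0)); // prints 500.0
    }
}
```

Under this model, the ceiling scales linearly with the number of in-flight requests, which is why dropping to 1 is most painful on high-latency links.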

Summary

Kafka’s partition-level ordering is strong and reliable. But when a producer retries failed sends while allowing several in-flight requests, order violations can occur. You can prevent this by allowing only one in-flight request per connection, at the cost of throughput.

✅ Set retries > 0

✅ Set max.in.flight.requests.per.connection = 1

📉 Expect lower throughput

🎯 But guaranteed order.

Order or throughput: the choice depends on what your system values most.

Ordering Guarantees in Apache Kafka Producers