Ordering Guarantees in Apache Kafka Producers

Vijay Belwal

In distributed systems, message order can make or break correctness for certain use cases, especially financial transactions, audit logs, and real-time event tracking. Kafka preserves order in a predictable and efficient way, but only within well-defined limits, and there are trade-offs and nuances every developer should understand.


Kafka’s Message Ordering Model

Kafka guarantees message ordering within a partition. When a producer sends messages in a defined order to the same partition, the broker writes them in that same order, and the consumer reads them in that same order.

Why per-partition?
Because partitions are Kafka’s unit of parallelism and storage. By scoping ordering guarantees to a partition, Kafka balances order and scalability.
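
To make "the same partition" concrete: for records with a key, Kafka's default partitioner hashes the key and maps the hash onto the topic's partitions, so records sharing a key share a partition and therefore an order. The sketch below illustrates only that idea. It uses String.hashCode() as a stand-in (Kafka itself hashes the serialized key bytes with murmur2), and the class name, keys, and partition count are hypothetical.

```java
class KeyPartitionSketch {
    // Simplified stand-in for a keyed partitioner: hash the key, then take it
    // modulo the partition count. Not Kafka's real hash function (assumption).
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 6; // hypothetical topic size
        String[] keys = {"account-42", "account-7", "account-42"};
        for (String key : keys) {
            // Both "account-42" records map to the same partition, so their
            // relative order is covered by the per-partition guarantee.
            System.out.println(key + " -> partition " + partitionFor(key, numPartitions));
        }
    }
}
```

Records with different keys may land on different partitions, and Kafka makes no promise about their relative order.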


When Does Order Matter?

If your use case requires reconstructing a timeline (e.g., bank debits/credits), processing out-of-order messages can lead to data corruption or inconsistencies. In such scenarios, preserving strict message order is non-negotiable.
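
As a rough illustration of why order matters for a ledger, the sketch below applies the same two events in both orders. The class, the overdraft rule, and the amounts are invented for this example and are not part of Kafka.

```java
import java.util.List;

class LedgerOrderSketch {
    // Applies signed amounts in sequence; a debit that would overdraw the
    // account is rejected (hypothetical business rule).
    static long applyAll(List<Long> amounts) {
        long balance = 0;
        for (long amount : amounts) {
            if (balance + amount < 0) {
                throw new IllegalStateException("overdraft: balance " + balance + ", amount " + amount);
            }
            balance += amount;
        }
        return balance;
    }

    public static void main(String[] args) {
        // Credit 100, then debit 80: both events apply.
        System.out.println(applyAll(List.of(100L, -80L))); // prints 20
        try {
            // Same events, swapped: the debit arrives before the credit.
            applyAll(List.of(-80L, 100L));
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The events are identical in both runs; only their order differs, and that alone changes the outcome.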


The Catch: Retries & In-flight Requests

Kafka producers are typically configured for resilience, which includes retries in case of transient failures (like a temporarily unavailable broker). But this introduces a subtle risk:

⚠️ The Ordering Problem

If the producer has idempotence disabled (enable.idempotence=false) and:

  • retries > 0

  • max.in.flight.requests.per.connection > 1

Then this can happen:

  1. Producer sends Batch A and Batch B to the same partition.

  2. Broker fails to write Batch A but successfully writes Batch B.

  3. Producer retries Batch A and it succeeds.

🚨 Result: Batch B appears before Batch A in the partition.
➡️ Order is violated.
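
The sequence above can be reproduced with a toy model. This is a simplified simulation of one connection writing to one partition, not Kafka's actual sender logic: the class, the method, and the "fail once" rule are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class InFlightReorderSketch {
    // Toy model: batches named in failOnce are rejected on their first
    // attempt and retried; everything else is appended to the partition log.
    static List<String> send(List<String> batches, Set<String> failOnce, int maxInFlight) {
        Deque<String> pending = new ArrayDeque<>(batches);
        Set<String> alreadyFailed = new HashSet<>();
        List<String> partitionLog = new ArrayList<>();

        while (!pending.isEmpty()) {
            // Fill the in-flight window with the next batches, in send order.
            List<String> inFlight = new ArrayList<>();
            while (inFlight.size() < maxInFlight && !pending.isEmpty()) {
                inFlight.add(pending.pollFirst());
            }
            // Each request succeeds or fails on its own, so an earlier batch
            // can fail while a later one in the same window is written.
            List<String> toRetry = new ArrayList<>();
            for (String batch : inFlight) {
                if (failOnce.contains(batch) && alreadyFailed.add(batch)) {
                    toRetry.add(batch);
                } else {
                    partitionLog.add(batch);
                }
            }
            // Failed batches return to the head of the queue for retry.
            for (int i = toRetry.size() - 1; i >= 0; i--) {
                pending.addFirst(toRetry.get(i));
            }
        }
        return partitionLog;
    }

    public static void main(String[] args) {
        List<String> batches = List.of("A", "B");
        System.out.println(send(batches, Set.of("A"), 2)); // prints [B, A]
        System.out.println(send(batches, Set.of("A"), 1)); // prints [A, B]
    }
}
```

With a window of two, B is written while A is still waiting for its retry. With a window of one, B is not sent until A has been acknowledged.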


Ensuring Order: The Reliable Approach

To guarantee message order even with retries enabled:

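// Retry transient send failures up to 5 times.
props.put(ProducerConfig.RETRIES_CONFIG, 5);
// Allow only one unacknowledged request per broker connection, so a retried batch cannot be overtaken.
props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 1);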

This ensures that if a batch fails and is retried, no other batch is sent on that connection in the meantime. The retried batch cannot be overtaken, so order is preserved.
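
For context, here is a fuller sketch of how those two settings might sit in a producer configuration. It uses the plain string property names rather than the ProducerConfig constants so that it compiles without the Kafka client on the classpath. The class name, the bootstrap address, and the choice of string serializers are assumptions for illustration.

```java
import java.util.Properties;

class OrderedProducerProps {
    // Builds producer settings that favor ordering over throughput.
    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Retry transient failures instead of dropping the batch.
        props.put("retries", "5");
        // One unacknowledged request per connection: a retry cannot be overtaken.
        props.put("max.in.flight.requests.per.connection", "1");
        return props;
    }

    public static void main(String[] args) {
        Properties props = build("localhost:9092"); // placeholder address
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The resulting Properties object is what you would pass to the KafkaProducer constructor in a real application.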

This configuration favors correctness over throughput.


The Trade-off: Throughput Penalty

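Limiting max.in.flight.requests.per.connection to 1 means the producer must wait for each request to a broker to be acknowledged before sending the next one. That removes pipelining on the connection and can significantly reduce throughput, especially when network latency is high. For high-throughput workloads where order is not crucial, this may not be desirable.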
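
A back-of-the-envelope model gives a feel for the size of the penalty. It assumes every request carries one full batch and takes a fixed round trip to be acknowledged, and it ignores batching delays, compression, and broker-side limits, so treat the numbers as upper bounds rather than predictions. The batch size and latency below are assumed values.

```java
class InFlightThroughputSketch {
    // Upper bound on bytes per second over one connection when at most
    // maxInFlight requests are outstanding and each takes roundTripMillis.
    static double maxBytesPerSecond(int batchBytes, double roundTripMillis, int maxInFlight) {
        double requestsPerSecond = maxInFlight * (1000.0 / roundTripMillis);
        return requestsPerSecond * batchBytes;
    }

    public static void main(String[] args) {
        int batchBytes = 16_384;       // assumed batch size in bytes
        double roundTripMillis = 10.0; // assumed produce round trip
        System.out.println("1 in flight: " + maxBytesPerSecond(batchBytes, roundTripMillis, 1) + " bytes/s");
        System.out.println("5 in flight: " + maxBytesPerSecond(batchBytes, roundTripMillis, 5) + " bytes/s");
    }
}
```

Under these assumptions a single in-flight request caps one connection at about 1.6 MB/s, against about 8.2 MB/s with five. In practice the gain is less than linear, but the direction of the trade-off is the same.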

🎯 Guideline: Only enforce strict ordering when business logic absolutely requires it. Otherwise, tune for throughput.


Summary

Kafka’s partition-level ordering is strong and reliable. But when producers retry failed sends and allow several in-flight requests at once, order violations can occur. You can prevent this by giving up some throughput and allowing only one in-flight request per connection.

✅ Set retries > 0

✅ Set max.in.flight.requests.per.connection = 1

📉 Expect lower throughput

🎯 But guaranteed order.

Order or throughput: the choice depends on what your system values most.
