Idempotency in Kafka Producer
How Does Producer and Broker Communication Work?
Whenever a producer sends a message to a broker, the broker appends the message to the respective partition and sends an ACK back to the producer. If the producer does not receive the ACK, it sends the same message to the broker again.
What Is the Problem Here? (Duplicate Messages)
Let’s walk through the problem in the scenario above:
The producer sends a message to the broker, and the broker appends it to the respective partition.
The broker sends an ACK to the producer, but the ACK never reaches the producer. Because the producer did not receive the ACK, it sends the same message again, and the broker appends a duplicate of the message to the partition.
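The scenario above can be reproduced with a small simulation. This is only a toy sketch, not Kafka code: `NaiveBroker`, `send_with_retry`, and the `drop_first_ack` flag are hypothetical names made up for illustration, and the "lost ACK" is modeled simply as `append` returning `False` after the message has already been written.

```python
class NaiveBroker:
    """Toy broker that appends every message it receives (no deduplication)."""

    def __init__(self, drop_first_ack=False):
        self.partition_log = []
        self._drop_next_ack = drop_first_ack

    def append(self, message):
        # The message is written to the partition first...
        self.partition_log.append(message)
        # ...and only then is the ACK sent, which may get lost on the way back.
        if self._drop_next_ack:
            self._drop_next_ack = False
            return False  # the producer never sees the ACK
        return True


def send_with_retry(broker, message, max_attempts=3):
    """Resend the message until an ACK arrives; return the number of attempts."""
    for attempt in range(1, max_attempts + 1):
        if broker.append(message):
            return attempt
    raise RuntimeError("no ACK received")


broker = NaiveBroker(drop_first_ack=True)
attempts = send_with_retry(broker, "order-42")
print(attempts, broker.partition_log)  # 2 ['order-42', 'order-42']
```

The producer behaved correctly from its own point of view (it retried until it got an ACK), yet the partition now holds the same message twice.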
Solution (Idempotent Producer)
Kafka assigns a unique Producer ID (PID) to each producer.
The producer maintains a sequence number for each TopicPartition it writes to.
Both the PID and an increasing sequence number accompany every message sent by the producer.
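For every TopicPartition, the broker tracks the highest sequence number that has been successfully written for each PID.
The broker appends a message only if its sequence number is exactly one greater than the last committed sequence number for the corresponding PID/TopicPartition pair. A sequence number that is equal or lower marks a retry of a message that is already in the log, so it is not appended again; a sequence number that skips ahead is rejected as out of order.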
This design ensures that even though a producer might retry requests on account of failures, every message is written to the log exactly once.
A new producer instance receives a new unique PID, so idempotent production is guaranteed only within a single producer session. A message resent by a restarted producer is not recognized as a duplicate.
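The PID and sequence-number check described above can be sketched as a short simulation. This is a simplified model under stated assumptions, not Kafka's actual implementation: `IdempotentBroker`, `IdempotentProducer`, the `lost_acks` parameter, the starting PID of 1000, and the string results (`"appended"`, `"duplicate"`, `"out_of_order"`) are all hypothetical names chosen for illustration.

```python
import itertools
from collections import defaultdict


class IdempotentBroker:
    """Toy broker that deduplicates by (PID, TopicPartition, sequence number)."""

    def __init__(self):
        self.logs = defaultdict(list)        # topic_partition -> messages
        self._last_seq = {}                  # (pid, topic_partition) -> last sequence written
        self._pids = itertools.count(1000)   # hypothetical PID allocator

    def assign_pid(self):
        return next(self._pids)

    def append(self, pid, topic_partition, seq, message):
        last = self._last_seq.get((pid, topic_partition), -1)
        if seq <= last:
            return "duplicate"       # already in the log, do not append again
        if seq != last + 1:
            return "out_of_order"    # gap in the sequence, reject
        self.logs[topic_partition].append(message)
        self._last_seq[(pid, topic_partition)] = seq
        return "appended"


class IdempotentProducer:
    """Toy producer: one PID per instance, one sequence counter per TopicPartition."""

    def __init__(self, broker):
        self.broker = broker
        self.pid = broker.assign_pid()
        self._next_seq = defaultdict(int)

    def send(self, topic_partition, message, lost_acks=0):
        seq = self._next_seq[topic_partition]
        self._next_seq[topic_partition] += 1
        # Each lost ACK triggers one retry with the SAME pid and sequence number.
        return [
            self.broker.append(self.pid, topic_partition, seq, message)
            for _ in range(lost_acks + 1)
        ]


broker = IdempotentBroker()
tp = ("orders", 0)

producer = IdempotentProducer(broker)
first = producer.send(tp, "order-42", lost_acks=1)    # ['appended', 'duplicate']
second = producer.send(tp, "order-43")                # ['appended']
gap = broker.append(producer.pid, tp, 5, "order-99")  # 'out_of_order'

restarted = IdempotentProducer(broker)                # new session, new PID
after_restart = restarted.send(tp, "order-42")        # ['appended']

print(broker.logs[tp])  # ['order-42', 'order-43', 'order-42']
```

Within the first session the retry of `order-42` is written only once. After the restart, the new PID starts its own sequence, so the broker cannot tell that `order-42` was already written, which illustrates why the guarantee holds only within a single producer session.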