confluent-kafka-dotnet
confluent-kafka-dotnet copied to clipboard
Semantics of batch processing and partition revoked ?
Hi,
I have a consumer that reads messages and sends them to Azure Service Bus (as there doesn't seem to be a sink Connector)
To get the throughput needed, I have to batch the messages before sending them to Service Bus.
My Kafka Consumer reads a batch of messages (50), sends them to azure, then calls StoreOffset for max offset for each tp (with autoCommit=true)
This works fine and I understand it is at least once processing.
My question is around rebalances - to try to minimize duplicate message processing can I flush the batched up messages and store offsets in the callback for the PartitionsRevokedHandler? Will offsets get correctly committed before the rebalance finishes (so the new subscribers pick up the correct offset) ?
Part of my confusion is not fully understanding the threading model of the C# consumer - are all callbacks dispatched on the same processing thread? (i.e. I won't get a partitionrevoked callback while its in the middle of sending messages to azure)
Thanks,
Matt
good questions
all the callbacks are dispatched on the same processing thread, yes. the rebalance callbacks happen as a side effect of a call to consume (or close). if you store offsets in the PartitionsRevokedHandler callback, yes, they will be committed before the consumer re-joins the group and the group rebalances.
Perfect - no work for me :)
Thanks @mhowlett
Hi,
I'm struggling with the rebalancing and deduplication too and the idea to commit offsets within the PartitionsRevokedHandler sounds just great.
So, as @mhowlett said, the current consumer would start from the committed offsets - one less to worry about :-)
But what about another consumers in that case?
Is it possible that partition revoked from consumer-A would be assigned to consumer-B and it would start consuming before offsets from consumer-A were committed (and how it works for eager and cooperative protocols)?
Hi @rtfleg, As per my understanding from other threads, Consumer B does not get assigned the partitions until PartitionsRevokedHandler get completed. In this delegate only you are planning to commit the stored offsets. After this is done then only it will be revoked.
So, before partitions assigned, offsets would have committed.
@mhowlett correct me if am wrong.