
Double reading of events from Kafka topic when enable.offsets.commit = "false"

Open xtrmstep opened this issue 4 years ago • 2 comments

Hi,

I have tested a Siddhi app and bumped into an issue with the setting enable.offsets.commit = "false". When it is set, the source may start reading the same events twice, duplicating them in the sink. I say "may" because there are no stable steps to reproduce it: usually it happens after the source has been running with that property set to false for some time, but it may also behave like this from the very beginning. Most likely, some leftover state stays in memory and causes this side effect.
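For context, the source definition in question has roughly this shape (the app, topic, group, stream, and server names below are illustrative placeholders, not taken from my actual app):

    @App:name('KafkaDuplicateTest')

    @source(type='kafka',
            topic.list='test_topic',
            partition.no.list='0',
            threading.option='single.thread',
            group.id='test_group',
            bootstrap.servers='localhost:9092',
            enable.offsets.commit='false',
            @map(type='json'))
    define stream InputStream (id int, message string);

    @sink(type='log')
    define stream OutputStream (id int, message string);

    from InputStream
    select *
    insert into OutputStream;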

The issue disappears when this setting is reverted to its default value (i.e. the property is removed), or when group.id is changed, although the second approach does not always make it go away.

Can this be fixed so that the consumer reads each event only once, without duplicating it, when the offset is not committed?

xtrmstep avatar Jun 18 '20 08:06 xtrmstep

Hi @xtrmstep,

To achieve exactly once processing semantics, you need to set:

enable.offsets.commit = "true"
enable.auto.commit = "false"

With the above in place, Siddhi will commit the offset for each consumed message, hence no message will be duplicated.
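For example, a source definition with these settings would look roughly like the following (the topic, group, stream, and server names are placeholders, not a definitive configuration):

    @source(type='kafka',
            topic.list='test_topic',
            partition.no.list='0',
            threading.option='single.thread',
            group.id='test_group',
            bootstrap.servers='localhost:9092',
            enable.offsets.commit='true',
            enable.auto.commit='false',
            @map(type='json'))
    define stream InputStream (id int, message string);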

I am not sure whether I have understood your question correctly, but why do you expect the consumer to process each event only once while enable.offsets.commit is set to "false"?

Thank you, Dilini

dilini-muthumala avatar Jun 24 '20 15:06 dilini-muthumala

@dilini-muthumala This was during testing. I have a topic in Kafka with prepared events in it (let's say event #1, event #2, event #3). With enable.offsets.commit = "false", my testing steps look as follows:

  1. run the Siddhi app (it reads all events, but does not commit the offset)
  2. stop the app
  3. make changes if needed
  4. run the app again (it should read the same topic from the beginning)

So the app should read the same events each time I run it. But it was actually reading events like this: [#1, #1], [#2, #2], [#3, #3] - that is what I got in the Siddhi source stream.

I could only work around it by removing that setting and, each time, deleting the registered group id in Kafka (or changing it on each run).

Regards,

xtrmstep avatar Jun 24 '20 17:06 xtrmstep