kafka-flow icon indicating copy to clipboard operation
kafka-flow copied to clipboard

Improve graceful shutdown sequence

Open nikitapecasa opened this issue 4 years ago • 0 comments
trafficstars

Currently it persists the state without updating committable offsets, so after shutdown we have updated state in persistent storage but no committed offsets, which is not a critical issue by itself as any sane application should support deduplication of input messages, meaning that when we start an app again it would load latest state and re-process some messages by simply ignoring them.

Improved graceful shutdown sequence should look like

  • finish processing of current messages we got from consumer.poll
  • persist the state
  • collect committable offsets after state is persisted (currently TimerFlowOf's release does nothing related to offsets on release)

with following conditions

  • if messages' processing fails, no persistence and no commit is required
  • if state's persisting fails, committable offsets cannot be updated and no commit is required

A minimum set of tests needs to be added

  • if state is persisted, correct offsets are committed, so that on next app run there's no re-processing of messages
  • if state's persistence failed, then commit should not happen

Also it might affect a special case of graceful shutdown for a single PartitionFlow instead of the whole app

  • on consumer group's rebalance some partitions might got removed from current consumer instance
  • as a result corresponding PartitionFlow would be removed from TopicFlow's cache, triggering the release of PartitionFlow, which would flush/persist the state if corresponding config is enabled (flushOnRevoke)
  • current implementation does not try to commit any offsets afterwards, it simply removes/forgets pending commits

Chances are quite high that it won't be an easy task as

  • we don't have access to pending offsets to be committed in TimerFlowOf
  • shutdown sequence is a result of a composition of different resources, so multiple files have to be adjusted, mb even some considerable part of the processing flow and the way we assemble the structure of Topic/Partition/Key/Timer flows

nikitapecasa avatar Sep 23 '21 10:09 nikitapecasa