Keepalive commit
Context: Even though Nakadi supports streaming up to max_uncommitted_events, it still expects a client to commit an event within 60 seconds; otherwise, the TCP connection gets closed.
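For reference, a minimal sketch of the consumption side this constraint applies to, using Nakadi's subscription streaming endpoint; the base URL, subscription id, and token are placeholders, and the parsing is deliberately left out:

```java
// Hedged sketch: open a Nakadi subscription stream with max_uncommitted_events.
// Every batch carries a cursor, and each cursor must be committed within the
// fixed 60-second commit timeout or Nakadi closes this connection, no matter
// how many uncommitted events the stream still allows.
import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class StreamConsumer {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://nakadi.example.org/subscriptions/<subscription-id>/events"
                        + "?max_uncommitted_events=5000&batch_limit=100"))
                .header("Authorization", "Bearer <token>")
                .GET()
                .build();

        HttpResponse<InputStream> response =
                client.send(request, HttpResponse.BodyHandlers.ofInputStream());
        // The stream id must be echoed back on every cursor commit.
        String streamId = response.headers().firstValue("X-Nakadi-StreamId").orElseThrow();

        try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(response.body()))) {
            String batch;
            while ((batch = reader.readLine()) != null) {
                // Each line is a JSON batch with a "cursor" and optional "events".
                // The cursor has to be committed within 60 seconds to keep the
                // connection (and the consumer's slot) alive.
                System.out.println(streamId + " -> " + batch);
            }
        }
    }
}
```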
Problem: Consumers that buffer events locally for a couple of hours before uploading them to a downstream storage engine (due to their processing semantics) are forced into an at-most-once delivery guarantee, since they must commit to Nakadi long before the buffer is safely stored. The likelihood of consumer, storage, or network issues during the buffering window is quite high, so a consumer can lose a whole buffer chunk whose events have already been committed to Nakadi.
Proposed change: Since Nakadi needs to keep track of consumer liveness anyway (issue #594), and a data commit in ZK is quite expensive, one possibility is to introduce a keepalive commit with a fake/artificial offset (aka "BEGIN" or ZERO). Consumers would be expected to ACK their healthiness via keepalive commits, but perform the real commit only when downstream processing is finished.
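A minimal sketch of how a consumer might use the proposed behaviour, assuming the existing subscription cursor-commit endpoint is reused and that it would accept the artificial "BEGIN" offset as a liveness-only commit; this is not an existing Nakadi feature, and the host, ids, and cursor fields are placeholders:

```java
// Hedged sketch of the *proposed* keepalive commit, not current Nakadi API:
// while events sit in the local buffer, the consumer periodically commits an
// artificial "BEGIN" offset purely to signal liveness; the real cursor is
// committed only after the buffered chunk has been persisted downstream.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class KeepaliveCommitter {
    private static final String BASE = "https://nakadi.example.org";
    private final HttpClient client = HttpClient.newHttpClient();
    private final String subscriptionId;
    private final String streamId;

    KeepaliveCommitter(String subscriptionId, String streamId) {
        this.subscriptionId = subscriptionId;
        this.streamId = streamId;
    }

    // Proposed keepalive: an artificial offset that only proves the consumer
    // is alive and would not move the committed position (so no ZK write).
    void sendKeepalive(String partition, String eventType, String cursorToken) throws Exception {
        commit("{\"items\":[{\"partition\":\"" + partition + "\","
                + "\"offset\":\"BEGIN\","
                + "\"event_type\":\"" + eventType + "\","
                + "\"cursor_token\":\"" + cursorToken + "\"}]}");
    }

    // Real commit: sent only once the buffered chunk is safely stored
    // downstream, so a crash before this point cannot lose committed data.
    void commitRealCursor(String cursorItemJson) throws Exception {
        commit("{\"items\":[" + cursorItemJson + "]}");
    }

    private void commit(String body) throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(BASE + "/subscriptions/" + subscriptionId + "/cursors"))
                .header("Content-Type", "application/json")
                .header("X-Nakadi-StreamId", streamId)
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        client.send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

The intent is that only the real commit advances the subscription's position; the keepalive merely resets the 60-second liveness clock, so a crash between keepalives and the real commit leaves the events replayable.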
Related issues:
- Make commit timeout configurable #594
- stream parameter max_uncommited_events_per_partition #609
ARUHA-986 was added to our backlog.