RADAR-Backend
Reliability vs high throughput
Kafka messages can be consumed according to two different strategies:
- at most once: messages can be lost due to consumer faults
- at least once: messages can be read twice due to consumer faults
This design choice drives the consumer implementation.
We may implement both approaches and select the most suitable one after testing (a sketch of the difference is given below).
EDIT @blootsvoets: reversed at most and at least
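To make the difference concrete, here is a rough sketch with the plain Kafka Java consumer (topic and group names are just placeholders, and it assumes a recent kafka-clients with `poll(Duration)`): committing offsets before processing gives at-most-once, committing after processing gives at-least-once.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class DeliverySemanticsSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "radar-consumers");          // illustrative group id
        props.put("enable.auto.commit", "false");          // commit manually to control semantics
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("sensor-data")); // illustrative topic
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));

                // At-most-once: commit here, before processing; a crash during
                // processing loses the uncommitted records.
                // consumer.commitSync();

                for (ConsumerRecord<String, String> record : records) {
                    process(record);
                }

                // At-least-once: commit after processing; a crash before this
                // commit causes the records to be delivered again.
                consumer.commitSync();
            }
        }
    }

    private static void process(ConsumerRecord<String, String> record) {
        System.out.printf("key=%s value=%s offset=%d%n",
                record.key(), record.value(), record.offset());
    }
}
```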
Since we add timestamps to all messages, I'd opt for at-least-once processing to start; we can always establish uniqueness by the timestamp. It may be harder to retrieve lost data.
Yes, but this approach opens another issue.
We would use consumer groups to maximise throughput. Groups imply that Kafka periodically and automatically re-assigns partitions across the consumers inside a single group to balance the workload, which results in a new generation of the group. After a rebalance, consumer_X may manage partitions previously handled by consumer_Y. The last-seen timestamp for each key therefore cannot be stored locally (e.g. inside each consumer); this information has to be stored in a data structure shared among all group members. A NonBlockingHashMap could solve this issue.
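A rough sketch of that idea, assuming all consumers of the group run as threads in a single JVM so that an in-process map really is shared; `ConcurrentHashMap` stands in here for NonBlockingHashMap (high-scale-lib), which implements the same `Map` interface.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Shared across all consumer threads of the group to drop duplicates after a rebalance. */
public final class LastSeenRegistry {
    // Key -> highest message timestamp processed so far.
    // A NonBlockingHashMap could be dropped in here instead of ConcurrentHashMap.
    private final Map<String, Long> lastSeen = new ConcurrentHashMap<>();

    /**
     * Returns true if the record is new and should be processed,
     * false if an equal or newer timestamp was already seen for this key.
     */
    public boolean markIfNew(String key, long timestamp) {
        boolean[] isNew = {false};
        lastSeen.compute(key, (k, previous) -> {
            if (previous == null || timestamp > previous) {
                isNew[0] = true;
                return timestamp;
            }
            return previous;
        });
        return isNew[0];
    }
}
```

Each consumer thread would call `markIfNew(key, timestamp)` before processing a record (with the timestamp taken from the payload, or from `ConsumerRecord.timestamp()`) and skip it when the method returns false.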
@fnobilia Presumably this is going to depend on the requirements of the Consumer / use case? I think At-Least-Once is generally recommended though.
https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-HowdoIgetexactly-oncemessagingfromKafka? As far as I'm aware, exactly-once consumption can be done, but consumers need to checkpoint locally and transactionally. However, in practice this overhead also places constraints on throughput.
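Roughly what that local transactional checkpoint could look like (just a sketch, with `enable.auto.commit=false`; `OffsetStore` is a hypothetical local transactional store, not an existing Kafka API): the result and the next offset are persisted together, and on assignment the consumer seeks back to the checkpointed position.

```java
import java.time.Duration;
import java.util.Collection;
import java.util.Collections;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class ExactlyOnceSketch {
    /** Hypothetical local store that commits result and offset in one transaction. */
    interface OffsetStore {
        long loadOffset(TopicPartition partition);                        // -1 if unknown
        void saveAtomically(TopicPartition partition, long nextOffset, String result);
    }

    public static void run(KafkaConsumer<String, String> consumer, OffsetStore store) {
        consumer.subscribe(Collections.singletonList("sensor-data"), new ConsumerRebalanceListener() {
            @Override public void onPartitionsRevoked(Collection<TopicPartition> partitions) { }
            @Override public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
                // Resume from the locally checkpointed position instead of Kafka's committed offset.
                for (TopicPartition partition : partitions) {
                    long offset = store.loadOffset(partition);
                    if (offset >= 0) {
                        consumer.seek(partition, offset);
                    }
                }
            }
        });
        while (true) {
            for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofMillis(500))) {
                String result = process(record);
                TopicPartition partition = new TopicPartition(record.topic(), record.partition());
                // Result and next offset are persisted in one transaction, so a crash cannot
                // produce output without advancing the checkpoint, or vice versa.
                store.saveAtomically(partition, record.offset() + 1, result);
            }
        }
    }

    private static String process(ConsumerRecord<String, String> record) {
        return record.value().toUpperCase();
    }
}
```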
Also take a look at Flink Checkpointing http://data-artisans.com/kafka-flink-a-practical-how-to/
Unfortunately, exactly-once semantics are still under active investigation (1 and 2).
Apache Flink may be useful in the near future to provide fault tolerance and scalability within the analysis layer.