fluent-plugin-kafka icon indicating copy to clipboard operation
fluent-plugin-kafka copied to clipboard

in_kafka_group: send heartbeat if necessary

Open GiedriusS opened this issue 3 years ago • 1 comments
trafficstars

We've spotted that processing a batch of Kafka messages in fluentd sometimes takes a long time and because of that Kafka consumer groups go into a constant rebalancing state. Fluentd lacks observability to understand what plugin is taking a long time so adding this as a stop gap solution for the moment. It's not trivial to add duration histograms to Fluentd so this is a stop-gap solution for now.

Kudos to @marijus-ravickas for the idea. More details available at https://www.marionzualo.com/2019/08/10/always-manually-send-heartbeats-when-processing-message-batches-in-ruby-kafka/.

Tests still pass:

37 tests, 42 assertions, 0 failures, 0 errors, 0 pendings, 7 omissions, 0 notifications         
100% passed

GiedriusS avatar Aug 11 '22 07:08 GiedriusS

Looking at ruby-kafka source code https://github.com/zendesk/ruby-kafka/blob/292f2d5a5b8ef4572485617121bf61a6725414cf/lib/kafka/consumer.rb#L359, it looks like they are already triggering heartbeat after every batch, so I don't think this change will yield any behavioural differences.

From what you've described, I would be interested to look at your session.timeout.ms (https://kafka.apache.org/documentation/#consumerconfigs_session.timeout.ms, which should default to 30s in ruby-kafka, this can be set via the session_timeout parameter), and see if it makes sense to bump that.

raytung avatar Aug 16 '22 03:08 raytung

This PR has been automatically marked as stale because it has been open 90 days with no activity. Remove stale label or comment or this PR will be closed in 30 days

github-actions[bot] avatar Nov 14 '22 10:11 github-actions[bot]

So, should we decline this PR and add descriptions for processing high volume log?

cosmo0920 avatar Nov 15 '22 15:11 cosmo0920

@cosmo0920 I think so. If the author finds a need for this again, they can re-open the PR in a future date

raytung avatar Dec 06 '22 00:12 raytung

OK. Let's close this PR then.

cosmo0920 avatar Dec 06 '22 14:12 cosmo0920