fluent-plugin-kafka
Rdkafka doesn't recover after kafka node crash: Local: Fatal error (fatal)
Describe the bug
I have Fluentd forwarding messages to 6 Kafka brokers. After a single broker crashes, roughly 1/6 of produce calls fail with `Local: Fatal error (fatal)` and keep failing even after the broker comes back (see the error log below).
To Reproduce
Configure the rdkafka2 producer, stop one Kafka broker, then start it again. The producer never recovers; a minimal reproduction sketch follows.
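If it helps, the failure mode can presumably be reproduced with the rdkafka gem alone, without Fluentd, since the fatal state lives inside the librdkafka client. A minimal sketch, assuming a placeholder broker list and topic (substitute your own):

```ruby
require "rdkafka"

# Placeholder broker address and topic for illustration.
config = Rdkafka::Config.new(
  "bootstrap.servers"  => "listofbrokers:9999",
  "enable.idempotence" => true
)
producer = config.producer

loop do
  begin
    # Restart one broker while this loop runs. Once librdkafka raises
    # the fatal error, every subsequent produce fails the same way.
    producer.produce(topic: "fluentd.unknown", payload: "ping #{Time.now}").wait
    puts "delivered"
  rescue Rdkafka::RdkafkaError => e
    puts "produce failed: #{e.message} (code=#{e.code})"
  end
  sleep 1
end
```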
Expected behavior
The Kafka producer should recover once the broker comes back; a sketch of one possible recovery path follows.
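For context, once librdkafka reports a fatal error (the idempotent producer uses it to flag possible state loss after a broker failure), that client instance is permanently unusable and a new one must be created; retrying produce on the old handle can never succeed. Below is a sketch of the kind of handling I would expect, not the plugin's actual code: `@rdkafka_config` is a hypothetical pre-built `Rdkafka::Config`, and I'm assuming the gem maps `RD_KAFKA_RESP_ERR__FATAL` to the `:fatal` error code.

```ruby
# Hypothetical wrapper: rebuild the producer after a fatal error instead
# of retrying forever on a client that can never produce again.
def produce_with_recovery(topic, payload)
  @producer ||= @rdkafka_config.producer
  @producer.produce(topic: topic, payload: payload).wait
rescue Rdkafka::RdkafkaError => e
  if e.code == :fatal
    @producer.close rescue nil  # discard the dead client
    @producer = nil             # next call builds a fresh producer
  end
  raise  # let Fluentd's retry machinery re-flush the chunk
end
```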
Your Environment
- Fluentd version: 1.14.6
- TD Agent version:
- fluent-plugin-kafka version: 0.17.5
- ruby-kafka version:
- Operating system:
- Kernel version:
Your Configuration
@type rdkafka2
@id out_kafka
brokers listofbrokers:9999
use_event_time true
topic_key _topic
exclude_topic_key true
default_topic fluentd.unknown
use_default_for_unknown_topic true
exclude_fields $._hash,$._index,$._alert,$._keep,$._sd,$._source,$._syslog_severity,$.kubernetes.labels.pod-template-hash
<format>
  @type json
</format>
compression_codec gzip
share_producer true
# NOTE: Idempotence is not supported unless acks are required from
# all ISRs; when enabled, we've also seen a memory leak on the Kafka side.
# idempotent true
# Don't wait for acks from all in-sync replicas when receiving
# records; one is sufficient. This is the best option for both
# performance and durability.
#required_acks 1
rdkafka_options {
  "enable.idempotence": true
}
ssl_client_cert_key /identity/client.key
ssl_client_cert /identity/client.crt
ssl_ca_cert /identity/ca.crt
<buffer _topic>
  @type memory
  overflow_action block
  chunk_full_threshold 0.9
  compress gzip        # text,gzip
  flush_mode interval  # default,interval,immediate,lazy
  flush_interval 10s
  flush_at_shutdown true
  flush_thread_count 4
</buffer>
Your Error Log
2022-06-13 09:26:24 +0000 [warn]: #1 [out_kafka_access] Send exception occurred: Local: Fatal error (fatal) at /usr/lib/ruby/gems/2.7.0/gems/rdkafka-0.11.1/lib/rdkafka/producer.rb:167:in `produce'
2022-06-13 09:26:24 +0000 [warn]: #1 [out_kafka_access] failed to flush the buffer. retry_times=11 next_retry_time=2022-06-13 10:02:19 +0000 chunk="5e1505b615b2145b6be8a740f2c72c83" error_class=Rdkafka::RdkafkaError error="Local: Fatal error (fatal)"
2022-06-13 10:02:19 +0000 [warn]: #1 [out_kafka_access] Send exception occurred: Local: Fatal error (fatal) at /usr/lib/ruby/gems/2.7.0/gems/rdkafka-0.11.1/lib/rdkafka/producer.rb:167:in `produce'
2022-06-13 10:02:19 +0000 [warn]: #1 [out_kafka_access] failed to flush the buffer. retry_times=12 next_retry_time=2022-06-13 11:08:44 +0000 chunk="5e1505b615b2145b6be8a740f2c72c83" error_class=Rdkafka::RdkafkaError error="Local: Fatal error (fatal)"
Additional context
No response