logback-kafka-appender

Broker reconnection issue

Open madiTG opened this issue 6 years ago • 6 comments

Expected Behavior

We should be able to restart Kafka behind load balancers, or restart one of the bootstrap servers, without causing problems for the app itself.

Current Behavior

When we put a load balancer in bootstrap.servers:

<appender name="kafkaOutAppender" class="com.github.danielwegener.logback.kafka.KafkaAppender">
        <encoder>
                <pattern></pattern>
        </encoder>
        <topic></topic>
        <keyingStrategy class="com.github.danielwegener.logback.kafka.keying.HostNameKeyingStrategy" />
        <deliveryStrategy class="com.github.danielwegener.logback.kafka.delivery.AsynchronousDeliveryStrategy" />
            <!-- each <producerConfig> translates to regular kafka-client config (format: key=value) -->
            <!-- producer configs are documented here: https://kafka.apache.org/documentation.html#newproducerconfigs -->
            <!-- bootstrap.servers is the only mandatory producerConfig -->
        <producerConfig>bootstrap.servers=1.2.3.4:9092</producerConfig>
        <producerConfig>acks=0</producerConfig>
        <producerConfig>block.on.buffer.full=false</producerConfig>
        <producerConfig>client.id=${HOSTNAME}-${CONTEXT_NAME}-logback-relaxed</producerConfig>
        <producerConfig>compression.type=none</producerConfig>

        <producerConfig>max.block.ms=0</producerConfig>
</appender>

bootstrap.servers=1.2.3.4:9092 is a load balancer with three brokers behind it.

After restarting one of the Kafka brokers, I get reconnection errors in the app:

- [Producer clientId=logback-relaxed] Uncaught error in kafka producer I/O thread:
[3/14/18 10:15:33:622 CET] 000000de SystemOut     O [kafka-producer-network-thread | logback-relaxed] cid: clid: E a: o.a.k.c.p.internals.Sender java.lang.NullPointerException: null
        at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:436)
        at org.apache.kafka.common.network.Selector.poll(Selector.java:399)
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:460)
- [Producer clientId=logback-relaxed] Uncaught error in kafka producer I/O thread:
[3/14/18 10:15:33:622 CET] 000000de SystemOut     O [kafka-producer-network-thread | logback-relaxed] cid: clid: E a: o.a.k.c.p.internals.Sender java.lang.NullPointerException: null
        at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:436)
        at org.apache.kafka.common.network.Selector.poll(Selector.java:399)
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:460)
- [Producer clientId=logback-relaxed] Uncaught error in kafka producer I/O thread:

madiTG avatar Mar 14 '18 10:03 madiTG

I am also having the same issue.

michalkaik avatar Mar 14 '18 11:03 michalkaik

I can't really relate this behavior to the appender; it looks like an issue in the producer itself. One question out of curiosity, though: why is your <topic></topic> name empty?

danielwegener avatar Mar 15 '18 13:03 danielwegener

@danielwegener I just removed the sensitive data. My topic name is set as usual, "something-something-something" :)

I found something about this setting:

.. , "default.topic.config": {"topic.metadata.refresh.interval.ms": 20000} ..

What do you think?

https://github.com/confluentinc/confluent-kafka-python/issues/59
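
Note that `topic.metadata.refresh.interval.ms` is a librdkafka setting used by the Python client. For the Java producer that this appender wraps, the closest counterpart is probably `metadata.max.age.ms`, which defaults to 300000 ms. A minimal sketch of how it could be passed through the appender, assuming the 20000 ms value from the linked issue. Whether it avoids the NullPointerException above is not confirmed:

```xml
<!-- Assumption: refresh cluster metadata every 20 s instead of the 5 min default -->
<producerConfig>metadata.max.age.ms=20000</producerConfig>
```

This line would go next to the other `<producerConfig>` entries inside the `<appender>` element.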

madiTG avatar Mar 15 '18 14:03 madiTG

Looks like this is a Kafka bug fixed in kafka-clients 1.0.1: https://issues.apache.org/jira/browse/KAFKA-6260 https://issues.apache.org/jira/browse/KAFKA-6682

madiTG avatar Mar 22 '18 11:03 madiTG

Yeah, thanks for the research. We should bump our kafka-clients dependency to 1.0.1 then. PRs are welcome ;)
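
Until the appender itself ships with the newer client, a possible workaround for Maven users is to declare kafka-clients directly, so that Maven prefers that version over the transitive one. A rough sketch, not a tested setup: the appender version shown is only an assumption, so keep whichever version you already use.

```xml
<!-- pom.xml fragment; the appender version is an assumption, keep the one you already depend on -->
<dependencies>
  <dependency>
    <groupId>com.github.danielwegener</groupId>
    <artifactId>logback-kafka-appender</artifactId>
    <version>0.2.0-RC1</version>
  </dependency>
  <!-- A direct declaration wins over the transitive kafka-clients version pulled in by the appender -->
  <dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>1.0.1</version>
  </dependency>
</dependencies>
```

Running `mvn dependency:tree` afterwards should show which kafka-clients version actually ends up on the classpath.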

danielwegener avatar Oct 29 '18 15:10 danielwegener

Is this issue related to CPU usage rising to 100% when the Kafka servers shut down?

ripper2hl avatar Aug 14 '21 02:08 ripper2hl