logging-log4j2
Appender may cause deadlock on AsyncLoggerDisruptor
Description
When the RingBuffer is full, the logging operation blocks, waiting for the appender to consume LogEvents from the RingBuffer. But the appender itself may log before consuming anything, which triggers a deadlock.
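The re-entrant enqueue described above can be modeled with a plain bounded queue; this is a minimal sketch (class and event names are hypothetical, and `ArrayBlockingQueue` stands in for the Disruptor's RingBuffer), showing why the appender's own thread can never complete a blocking enqueue once the buffer is full:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

public class ReentrantLoggingDeadlock {
    public static void main(String[] args) throws Exception {
        // Hypothetical 1-slot "ring buffer" standing in for the Disruptor queue.
        ArrayBlockingQueue<String> ring = new ArrayBlockingQueue<>(1);
        ring.put("app-event"); // the queue is now full

        // The appender's background thread tries to log *before* it has
        // consumed anything. A blocking enqueue (put) would park forever,
        // because this same thread is the only consumer; a timed offer
        // makes the hang observable instead of deadlocking the demo.
        boolean enqueued = ring.offer("appender-event", 100, TimeUnit.MILLISECONDS);
        System.out.println(enqueued); // prints "false": the enqueue can never succeed
    }
}
```

In the real stack traces below, the role of the blocking `put` is played by `AsyncLoggerDisruptor.enqueueLogMessageWhenQueueFull`, and the "only consumer" is the Kafka producer's network thread.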
Here is my actual case in production: I'm using the Kafka appender, and the Kafka client's own logging caused the deadlock; see TransactionManager.java:
```
"kafka-producer-network-thread | producer-1" #44 daemon prio=5 os_prio=0 tid=0x00007f468310c800 nid=0x256255 waiting for monitor entry [0x00007f467ad6c000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.apache.logging.log4j.core.async.AsyncLoggerDisruptor.enqueueLogMessageWhenQueueFull(AsyncLoggerDisruptor.java:246)
	- waiting to lock <0x0000000701070490> (a java.lang.Object)
	at org.apache.logging.log4j.core.async.AsyncLogger.handleRingBufferFull(AsyncLogger.java:246)
	at org.apache.logging.log4j.core.async.AsyncLogger.publish(AsyncLogger.java:230)
	at org.apache.logging.log4j.core.async.AsyncLogger.logWithThreadLocalTranslator(AsyncLogger.java:202)
	at org.apache.logging.log4j.core.async.AsyncLogger.access$100(AsyncLogger.java:67)
	at org.apache.logging.log4j.core.async.AsyncLogger$1.log(AsyncLogger.java:157)
	at org.apache.logging.log4j.core.async.AsyncLogger.logMessage(AsyncLogger.java:130)
	at org.apache.logging.slf4j.Log4jLogger.log(Log4jLogger.java:378)
	at org.apache.kafka.common.utils.LogContext$LocationAwareKafkaLogger.writeLog(LogContext.java:434)
	at org.apache.kafka.common.utils.LogContext$LocationAwareKafkaLogger.info(LogContext.java:387)
	at org.apache.kafka.clients.producer.internals.TransactionManager.setProducerIdAndEpoch(TransactionManager.java:505)
	at org.apache.kafka.clients.producer.internals.TransactionManager.bumpIdempotentProducerEpoch(TransactionManager.java:542)
	at org.apache.kafka.clients.producer.internals.TransactionManager.bumpIdempotentEpochAndResetIdIfNeeded(TransactionManager.java:559)
	- locked <0x00000007016c0898> (a org.apache.kafka.clients.producer.internals.TransactionManager)
	at org.apache.kafka.clients.producer.internals.Sender.runOnce(Sender.java:330)
	at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:247)
	at java.lang.Thread.run(Thread.java:748)

"http-nio-8080-exec-273" #3682 daemon prio=5 os_prio=0 tid=0x00007f465b422800 nid=0x2908d3 runnable [0x00007f4648b48000]
   java.lang.Thread.State: TIMED_WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:338)
	at com.lmax.disruptor.MultiProducerSequencer.next(MultiProducerSequencer.java:136)
	at com.lmax.disruptor.MultiProducerSequencer.next(MultiProducerSequencer.java:105)
	at com.lmax.disruptor.RingBuffer.publishEvent(RingBuffer.java:465)
	at com.lmax.disruptor.dsl.Disruptor.publishEvent(Disruptor.java:331)
	at org.apache.logging.log4j.core.async.AsyncLoggerDisruptor.enqueueLogMessageWhenQueueFull(AsyncLoggerDisruptor.java:247)
	- locked <0x0000000701070490> (a java.lang.Object)
	at org.apache.logging.log4j.core.async.AsyncLogger.handleRingBufferFull(AsyncLogger.java:246)
	at org.apache.logging.log4j.core.async.AsyncLogger.publish(AsyncLogger.java:230)
	at org.apache.logging.log4j.core.async.AsyncLogger.logWithThreadLocalTranslator(AsyncLogger.java:225)
	at org.apache.logging.log4j.core.async.AsyncLogger.access$000(AsyncLogger.java:67)
	at org.apache.logging.log4j.core.async.AsyncLogger$1.log(AsyncLogger.java:152)
	at org.apache.logging.log4j.core.async.AsyncLogger.log(AsyncLogger.java:136)
	at org.apache.logging.log4j.spi.AbstractLogger.tryLogMessage(AbstractLogger.java:2168)
	at org.apache.logging.log4j.spi.AbstractLogger.logMessageTrackRecursion(AbstractLogger.java:2122)
	at org.apache.logging.log4j.spi.AbstractLogger.logMessageSafely(AbstractLogger.java:2105)
	at org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:1997)
	at org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:1862)
	at org.apache.logging.slf4j.Log4jLogger.info(Log4jLogger.java:185)
```
Proposal
The issue is not Kafka-specific: log4j should discard LogEvents produced by appenders when the underlying RingBuffer is full. Even if it were not the Kafka client, any other client component that integrates log4j2 has the same problem, because that component also produces log events.
Two workarounds were suggested:

- Disable log4j2 for the client component.
- Increase the capacity of the RingBuffer.

> Disable log4j2 for the client component.

Is there a way to disable logging for a particular lib other than raising the logging level threshold?

> Increase the capacity of the RingBuffer.

Not an option: it will fill up sooner or later.
> [!TIP]
> TL;DR: Please set the `log4j2.asyncLoggerSynchronizeEnqueueWhenQueueFull` Log4j configuration property to `false`. See Property Sources for a list of ways to do it.
@quaff,
There are already a lot of configuration properties that regulate what happens when the queue becomes full:
- `log4j2.asyncQueueFullPolicy` decides what to do when the queue becomes full: by default Log4j Core "waits" until the queue becomes empty. You can set this to `Discard` to discard the event.
- `log4j2.asyncLoggerWaitStrategy` decides what "wait" means. Some of the wait strategies are quite CPU intensive: they don't yield control of the thread.
- `log4j2.asyncLoggerSynchronizeEnqueueWhenQueueFull` makes the wait strategy less CPU intensive by blocking when the queue is full.
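For illustration, the three properties could be set together in a `log4j2.component.properties` file on the classpath. This is only a sketch, not a recommendation for every setup; in particular, `Discard` trades completeness of logs for liveness:

```properties
# Sketch of a log4j2.component.properties file (values are illustrative):
# drop events instead of waiting when the queue is full
log4j2.asyncQueueFullPolicy = Discard
# optional: events below this level are discarded first (default is INFO)
log4j2.discardThreshold = INFO
# never block producer threads behind a shared monitor when the queue is full
log4j2.asyncLoggerSynchronizeEnqueueWhenQueueFull = false
```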
Note that Log4j Core never blocks on the asynchronous thread used by AsyncLoggerDisruptor:
https://github.com/apache/logging-log4j2/blob/b593be7d881bb4c46d8ea09e14bb4ee9dea9567b/log4j-core/src/main/java/org/apache/logging/log4j/core/async/AsyncLoggerDisruptor.java#L331-L340
Your problem arises because the Kafka appender uses its own background thread to send log events, and Log4j Core does not know that it should not block on that thread. You have several solutions to this problem:
- You can simply tell Log4j Core not to block on any thread by setting `log4j2.asyncLoggerSynchronizeEnqueueWhenQueueFull` to `false`.
- You can stop using asynchronous loggers with the Kafka Appender: IMHO it doesn't make sense to have two asynchronous barriers that log events must pass before being logged.
- You can use a synchronous logger for `org.apache.kafka` and an asynchronous logger for the rest (see Mixed sync/async loggers):

  ```xml
  <AsyncRoot level="INFO">
    <AppenderRef ref="KAFKA"/>
  </AsyncRoot>
  <Logger name="org.apache.kafka">
    <AppenderRef ref="CONSOLE"/>
  </Logger>
  ```
> [!WARNING]
> To switch from the "All asynchronous loggers" setup to the "Mixed sync/async loggers" setup, you need to restore the default value of the `log4j2.contextSelector` configuration property.
@quaff, @ppkarwasz is spot on with his analysis and suggestions. My 2 cents are...
- Stick to the simplest setup (i.e., only use synchronous loggers) unless you can prove that this causes a significant performance regression for your application.
- Kafka Appender is planned to be removed. You should consider more future-proof alternatives.
Thanks for your quick response.

Currently I'm setting `log4j2.contextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector` and using two types of appenders: `RollingFile` for local files and `Kafka` for centralized log collection. I think mixed sync/async is suitable for this use case, making `RollingFile` async and `Kafka` sync; please correct me if I'm wrong.
> `log4j2.asyncLoggerSynchronizeEnqueueWhenQueueFull` to `false`.

The correct system property key is `AsyncLogger.SynchronizeEnqueueWhenQueueFull`.
> Thanks for your quick response. Currently I'm setting `log4j2.contextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector` and using two types of appenders, `RollingFile` for local files and `Kafka` for centralized log collection. I think mixed sync/async is suitable for such a use case, making `RollingFile` async and `Kafka` sync; please correct me if I'm wrong.
Using `RollingFile` asynchronously for the `org.apache.kafka` package and `Kafka` synchronously for the rest should work, although I would recommend only using synchronous loggers.

When you use asynchronous elements, you lose the ability to receive `LoggingException`s if an appender fails. If logging is important for your application, consider using a synchronous setup with a Failover appender that uses Kafka as primary and RollingFile as failover.
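A Failover setup along those lines might look like the following sketch; the appender names `KAFKA` and `ROLLING` are assumptions (any appenders already defined in the configuration would work), and the surrounding `Appenders`/`Loggers` sections are elided:

```xml
<!-- Sketch only: KAFKA and ROLLING must match appenders defined elsewhere -->
<Failover name="FAILOVER" primary="KAFKA">
  <Failovers>
    <AppenderRef ref="ROLLING"/>
  </Failovers>
</Failover>
```

With this setup, a `LoggingException` thrown by the Kafka appender causes the event to be written to the rolling file instead of being silently lost.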
> > `log4j2.asyncLoggerSynchronizeEnqueueWhenQueueFull` to `false`.
>
> The correct system property key is `AsyncLogger.SynchronizeEnqueueWhenQueueFull`.
That is the legacy pre-2.10 name of the configuration property. While it still works for backward compatibility, `AsyncLogger.SynchronizeEnqueueWhenQueueFull` is deprecated and you should use:

- `log4j2.asyncLoggerSynchronizeEnqueueWhenQueueFull` if you use Java system properties or a `log4j2.component.properties` file,
- `LOG4J_ASYNC_LOGGER_SYNCHRONIZE_ENQUEUE_WHEN_QUEUE_FULL` if you want to set an OS environment variable.
This issue is stale because it has been waiting for your feedback for more than 60 days. The Apache Logging Services community values every submitted issue, but without additional information from you, we are unable to provide a solution to your problem. Please comment on this issue or it will be closed in 7 days.