Memory Usage Increase in Log4j2 2.24.3 with Async Logging Due to RingBufferLogEvent
Description
I am using Log4j2 in my project and recently upgraded from version 2.23.1 to 2.24.3. The application employs an asynchronous logging strategy with Disruptor version 3.4.3. Below is the relevant configuration:
log4j2.contextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelectorlog4j2.asyncQueueFullPolicy=org.apache.logging.log4j.core.async.DiscardingAsyncQueueFullPolicylog4j2.discardThreshold=ERRORlog4j2.asyncLoggerWaitStrategy=Yieldlog4j2.asyncLoggerRingBufferSize=1048576
The logging frequency in my application is relatively low, averaging about 10 log entries per second. When using version 2.23.1, memory usage remained stable based on monitoring data . However, after upgrading to 2.24.3, I observed a gradual increase in memory usage over time, which eventually stabilizes at a fixed threshold.
This behavior differs significantly from what I experienced with 2.23.1. Through heap dump analysis at different time intervals, I confirmed that the memory growth is primarily associated with RingBufferLogEvent objects within Log4j2’s RingBuffer implementation, with the RingBuffer’s memory footprint consistently increasing over time.
I noticed that the 2.24.3 release includes several changes related to memory management. Could any of these changes, particularly those affecting the RingBuffer or RingBufferLogEvent, explain the behavior I am observing?
If additional details are needed, please let me know, and I’d be happy to provide them.
Environment
- Log4j2 Version: 2.24.3 (upgraded from 2.23.1)
- Disruptor Version: 3.4.3
- JVM: Zing24.08.101.0+1 (build 21.0.4.0.101+1-LTS)
- Logging Frequency: ~10 logs/second
Hi @shaonan666,
RingBufferLogEvent does not have a fixed retained size and can gain weight in time if all garbage-free logging options are enabled. This is due to the fact that reusable objects are reused instead of being garbage-collected. A retained size of 744 is nothing alarming.
You should check the following options:
log4j2.enableThreadlocalsenables GC-free logging.- If GC-free mode is enabled
log4j2.messageFactorydefaults toReusableMessageFactoryand aStringBuilderwith variable size (from 128 to 518) is used, as shown by your graphs. - If you additionally enable
log4j2.garbagefreeThreadContextMapeach log event will retain and reuse an instance ofSortedArrayStringMap, which also appears in your graphs.
[!NOTE] Log4j uses a fuzzy-matching algorithm to match both the official spelling of the configuration properties and some obsolete spellings. You should look if your configuration properties have names similar to those above, that start with
log4jororg.apache.logging.
Hi @ppkarwasz
My program isn’t a web app, so Constants#isWebApp returns false at runtime, setting log4j2.enableThreadlocals to true. I didn’t intend to enable GC-free mode, it happened accidentally because my program isn’t a web app, resulting in “half” of GC-free mode being enabled. Now I understand that to fully enable GC-free mode, I also need to set log4j2.garbagefreeThreadContextMap to true.
Additionally, I’ve noticed some version differences:
-
In
2.23.1, iflog4j2.garbagefreeThreadContextMapisn’t enabled, it returns CopyOnWriteSortedArrayThreadContextMap by this logic. -
In
2.24.3, iflog4j2.garbagefreeThreadContextMapisn’t enabled, the program is temporarily identified as a 'WEB_APP_CONTEXT_MAP' and ultimately returns DefaultThreadContextMap. (see Log4jProvider.java)
Based on these observations, I have the some questions:
-
Could the difference between
DefaultThreadContextMapandCopyOnWriteSortedArrayThreadContextMapbe the reason I’m seeing an increase in memory usage? Could you briefly explain the differences between these two classes? I’ve noticed they affect memory usage, and I’d like to understand why. -
Under the same configuration (not a web app), when
log4j2.garbagefreeThreadContextMapisn’t enabled, memory usage in 2.24.3 seems less stable compared to 2.23.1. Is this an expected change?
Based on these observations, I have the some questions:
- Could the difference between
DefaultThreadContextMapandCopyOnWriteSortedArrayThreadContextMapbe the reason I’m seeing an increase in memory usage? Could you briefly explain the differences between these two classes? I’ve noticed they affect memory usage, and I’d like to understand why.
Up to version 2.23.x:
DefaultThreadContextMapwas an implementation ofThreadContextMapbasically equivalent toThreadLocal<HashMap>. Since it only bound classes from the bootstrap classloader to threads, it was safe to use in web applications without memory leaks.DefaultThreadContextMapwas also a copy-on-write implementation, which allowed us to transfer the maps returned by it to other threads.CopyOnWriteSortedArrayThreadContextMapwas a more performant copy-on-write implementation, but it bound Log4j classes to threads, so it wasn't safe to use in web applications.
In version 2.24.0 a version of DefaultThreadContextMap was introduced that was both webapp-safe and fast (see #2330). The package-private CopyOnWriteSortedArrayThreadContextMap was removed.
Since both implementations use immutable objects RingBufferLogEvent will not keep references to them:
https://github.com/apache/logging-log4j2/blob/93afabcd3817738454e38f4cf65dfa84268a2eb1/log4j-core/src/main/java/org/apache/logging/log4j/core/async/RingBufferLogEvent.java#L451-L459
Your log events retain instances of SortedArrayStringMap, which seems to indicate that GarbageFreeSortedArrayThreadContextMap is used instead. In code you can check it with ProviderUtil.getProvider().getThreadContextMapInstance() (since 2.24.0), while in previous versions with ThreadContext.getThreadContextMap() (which returns null for the default map).
- Under the same configuration (not a web app), when
log4j2.garbagefreeThreadContextMapisn’t enabled, memory usage in 2.24.3 seems less stable compared to 2.23.1. Is this an expected change?
The use of DefaultThreadContextMap will generate a little bit more temporary objects than CopyOnWriteSortedArrayThreadContextMap, in particular instances of JdkMapAdapterStringMap. These should however have no influence on the amount of memory retained in the long term.
Right now, my best bet on why your system behaves the way he does is that GarbageFreeSortedArrayThreadContextMap is somehow used.
Hi @ppkarwasz
I checked the dump file and code, and it seems GarbageFreeSortedArrayThreadContextMap isn’t used.
In code you can check it with ProviderUtil.getProvider().getThreadContextMapInstance()
In version 2.24.3, using ProviderUtil.getProvider().getThreadContextMapInstance(), I got DefaultThreadContextMap because GC_FREE_THREAD_CONTEXT_PROPERTY isn’t enabled
https://github.com/apache/logging-log4j2/blob/4aa5efec7f2ee2d4623381cd66c3df3a16a6b9a9/log4j-core/src/main/java/org/apache/logging/log4j/core/impl/Log4jProvider.java#L113-L116
https://github.com/apache/logging-log4j2/blob/4aa5efec7f2ee2d4623381cd66c3df3a16a6b9a9/log4j-core/src/main/java/org/apache/logging/log4j/core/impl/Log4jProvider.java#L127-L129
in previous versions with ThreadContext.getThreadContextMap() (which returns null for the default map).
In version 2.23.1, using ThreadContext.getThreadContextMap() (which returns null for the default map), I got CopyOnWriteSortedArrayThreadContextMap based on this logic
Constants.ENABLE_THREADLOCALSis true becauseisWebAppis falseGcFreeThreadContextKeyis false by default value
https://github.com/apache/logging-log4j2/blob/fea2a7116160fb1555d578406444b4fc4f0ef2da/log4j-api/src/main/java/org/apache/logging/log4j/spi/ThreadContextMapFactory.java#L139-L146
Additionally, I couldn’t find GarbageFreeSortedArrayThreadContextMap in the dump files for either version.
@shaonan666,
I was looking at the heap dump you shared before:
I am not able to reproduce this state, unless I add an Additional Context Property to my logger configuration:
<Root level="INFO">
<AppenderRef ref="FILE"/>
<Property name="key" value="value"/>
</Root>
Can you share your log4j2.xml configuration file?
Hi @ppkarwasz
This is my log4j2.xml configuration file. Could you please review it for me?
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="INFO">
<properties>
<Property name="log4j2.path" value="${sys:logging.path:[email protected]@}"/>
<Property name="log4j2.immediateFlush" value="${sys:logging.immediateFlush:[email protected]@}"/>
<Property name="log4j2.level" value="${sys:logging.level:[email protected]@}"/>
<Property name="log4j2.max.file.size" value="xxxMB"/>
<Property name="log4j2.max.file.count" value="xxx"/>
<Property name="log4j2.max.file.time" value="xxx"/>
<Property name="PID">????</Property>
<Property name="LOG_EXCEPTION_CONVERSION_WORD">%xwEx</Property>
<Property name="LOG_LEVEL_PATTERN">%5p</Property>
<Property name="LOG_DATEFORMAT_PATTERN">yyyy-MM-dd HH:mm:ss.SSS</Property>
<Property name="CONSOLE_LOG_PATTERN">xxxx</Property>
<Property name="FILE_LOG_PATTERN">xxx</Property>
</properties>
<Appenders>
<Console name="Console" target="SYSTEM_OUT">
<PatternLayout pattern="${CONSOLE_LOG_PATTERN}"/>
</Console>
<RollingRandomAccessFile name="FILE_INFO"
fileName="${log4j2.path}/info.log"
filePattern="${log4j2.path}/archive/history_info-%d{yyyy-MM-dd}.%i.zip"
immediateFlush="${log4j2.immediateFlush}">
<PatternLayout pattern="${FILE_LOG_PATTERN}"/>
<Policies>
<TimeBasedTriggeringPolicy interval="1" modulate="true"/>
<SizeBasedTriggeringPolicy size="${log4j2.max.file.size}"/>
</Policies>
<DefaultRolloverStrategy max="${log4j2.max.file.count}">
<Delete basePath="${log4j2.path}/archive/" maxDepth="2">
<IfFileName glob="*.zip"/>
<IfLastModified age="${log4j2.max.file.time}"/>
</Delete>
</DefaultRolloverStrategy>
</RollingRandomAccessFile>
<RollingRandomAccessFile name="FILE_ERROR"
fileName="${log4j2.path}/error.log"
filePattern="${log4j2.path}/archive/history_error-%d{yyyy-MM-dd}.%i.zip"
immediateFlush="${log4j2.immediateFlush}">
<Filters>
<ThresholdFilter level="WARN" onMatch="ACCEPT" onMismatch="DENY"/>
</Filters>
<PatternLayout pattern="${FILE_LOG_PATTERN}"/>
<Policies>
<TimeBasedTriggeringPolicy interval="1" modulate="true"/>
<SizeBasedTriggeringPolicy size="${log4j2.max.file.size}"/>
</Policies>
<DefaultRolloverStrategy max="${log4j2.max.file.count}">
<Delete basePath="${log4j2.path}/archive/" maxDepth="2">
<IfFileName glob="*.zip"/>
<IfLastModified age="${log4j2.max.file.time}"/>
</Delete>
</DefaultRolloverStrategy>
</RollingRandomAccessFile>
</Appenders>
<Loggers>
<Root level="${log4j2.level}" includeLocation="false">
<AppenderRef ref="Console"/>
<AppenderRef ref="FILE_INFO"/>
<AppenderRef ref="FILE_ERROR"/>
</Root>
</Loggers>
</Configuration>
@shaonan666,
Looking more carefully at your memory graphs, I think that the expansion of StringBuilder instances in the RingBufferLogEvent due to GC-free logging is enough to explain the gradual increase in retained memory:
- An never used
RingBufferLogEventretains 176 bytes of memory. I contains an empty instance ofSortedArrayStringMapthat will be removed after the first usage and noStringBuilderinstance. - If the event is used to log a short message, its size increased to 352 bytes of memory, due to the instantiation of the
StringBuilder. - For longer message, the size increases, but is capped at 744 bytes.
Given the huge RingBuffer you are using (our default in GC-free mode is merely 4096) you should expect an increase of 176 MiB in retained memory size once all the RingBufferLogEvents have been used at least once.
Since only 40% of your log events have been used, you experienced an increase of around 70 MiB.
Hi @ppkarwasz
StringBuilder instances in the RingBufferLogEvent due to GC-free logging
Does this refer to the fact that I only configured log4j2.enableThreadlocals=true (due to isWebApp=false)?
Given the huge RingBuffer you are using (our default in GC-free mode is merely 4096) you should expect an increase of 176 MiB in retained memory size once all the RingBufferLogEvents have been used at least once.
However, with the same configuration and code in version 2.23.1, why didn’t I observe this memory increase? What specific code changes from 2.23.1 to 2.24.3 might have caused this memory behavior change?
Thank you for your patient explanations.
However, with the same configuration and code in version 2.23.1, why didn’t I observe this memory increase? What specific code changes from 2.23.1 to 2.24.3 might have caused this memory behavior change?
Apparently the behavior in 2.23.1 was due to a bug (see #2379), which disabled part of the GC-free behavior.
To draw some conclusions from this thread, what are your expectations regarding GC-free mode?
- Do you expect Log4j Core to pre-allocate
StringBuilders of sizelog4j2.initialReusableMsgSizeinstead of doing it on-demand? - If the
ThreadContextMapis in GC-free mode the maps that contain context data never shrink. Should there be a configuration option to shrink them, when they are cleared? cc/@vy
If the
ThreadContextMapis in GC-free mode the maps that contain context data never shrink. Should there be a configuration option to shrink them, when they are cleared?
@ppkarwasz, keeping the TCM size under a certain threshold sounds like a good thing, similar to what we do with StringBuilders using StringBuilders::trimToMaxSize and the log4j.layoutStringBuilder.maxSize property. The brutal truth is, as I stated earlier a couple of times, TCM really needs a rewrite.
Hi @ppkarwasz
I think I’ve got it. Let me describe what happened again to ensure I fully understand.
My program doesn’t specify a messageFactory.
In version 2.23.1, LoggerContext#getLogger(String name) ultimately initializes a ParameterizedMessageFactory
https://github.com/apache/logging-log4j2/blob/fea2a7116160fb1555d578406444b4fc4f0ef2da/log4j-core/src/main/java/org/apache/logging/log4j/core/LoggerContext.java#L493-L495 https://github.com/apache/logging-log4j2/blob/fea2a7116160fb1555d578406444b4fc4f0ef2da/log4j-api/src/main/java/org/apache/logging/log4j/spi/AbstractLogger.java#L134-L137
In version 2.24.3, LoggerContext#getLogger(String name) uses a GC-free-compliant ReusableMessageFactory based on the configuration log4j2.enableThreadlocals=true
https://github.com/apache/logging-log4j2/blob/4aa5efec7f2ee2d4623381cd66c3df3a16a6b9a9/log4j-core/src/main/java/org/apache/logging/log4j/core/LoggerContext.java#L93-L97
https://github.com/apache/logging-log4j2/blob/4aa5efec7f2ee2d4623381cd66c3df3a16a6b9a9/log4j-core/src/main/java/org/apache/logging/log4j/core/LoggerContext.java#L538-L540
These two different MessageFactory implementations are the main reason I’ve observed a gradual increase in memory usage: to support GC-free, ReusableMessageFactory allocates objects that aren’t released until all RingBufferLogEvent instances have been used at least once, at which point the memory growth stops.
Do you expect Log4j Core to pre-allocate StringBuilders of size log4j2.initialReusableMsgSize instead of doing it on-demand?
Pre-allocating a suitably sized StringBuilder based on prediction could be a good choice for services with high runtime latency requirements. But my main concern is the gradual memory increase in version 2.24.3, which I didn’t see in 2.23.1, and I initially suspected a memory leak. However, based on the current situation, it doesn’t seem to be a memory leak risk. I’m considering reducing the RingBuffer size to manage memory usage.
Thank you for your patient explanations!