opentelemetry-java-instrumentation
TracingSubscriber excessive memory usage - 81% memory of io.lettuce.core.RedisPublisher$SubscriptionCommand
Describe the bug
I wouldn't call this a bug, but it is unexpected. Under high load our reactive service goes OOM. The cause is the lettuce-core CommandHandler#stack deque (please read everything to find out why I opened the issue here). I've already read this and the possibility of bounding the queue. With a bounded queue the service won't go OOM, but an exception will still be thrown.
I have a blocking version of the same service and, under the same load, it doesn't go OOM, probably because concurrent requests are capped at 200 (the Tomcat default). Still, a reactive stack that can't handle the same load as a non-reactive stack is unexpected, right?
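To make the point about the cap concrete, here is a minimal sketch, assuming the relevant difference is simply that the blocking stack limits how many requests are in flight at once. `InFlightLimiter` is a hypothetical helper written for illustration; it is not part of lettuce, Tomcat, or OpenTelemetry. It only shows the trade-off described above: a hard limit keeps memory bounded, but callers beyond the limit have to be rejected or delayed.

```java
import java.util.concurrent.Semaphore;

// Hypothetical helper (not a lettuce or OpenTelemetry API): caps the number
// of commands that may be in flight at the same time.
final class InFlightLimiter {
    private final Semaphore permits;

    InFlightLimiter(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    // Returns true if a new command may start, false if the cap is reached.
    // Never blocks, so a reactive caller can reject or retry later.
    boolean tryStart() {
        return permits.tryAcquire();
    }

    // Must be called exactly once for every successful tryStart(),
    // when the command completes or fails.
    void finish() {
        permits.release();
    }

    // Number of commands that could still be started right now.
    int available() {
        return permits.availablePermits();
    }
}
```

With a cap of 200, the 201st concurrent caller gets `false` from `tryStart()` instead of adding another pending command (and whatever per-command state is attached to it) to the heap.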
I had a look at the thread dump:
This variable is taking almost 500MB of memory (~80% of the total available memory). Looking at the stored commands shows that:
TracingSubscriber accounts for around 81% of the total command memory - 85KB. I opened the issue here and not on lettuce-core because the problem seems to be the excessive amount of memory this object takes compared to the command itself.
I ran the same load test without the OpenTelemetry Java agent and the service indeed didn't go OOM.
Steps to reproduce
Run a load test on an instrumented reactive service that uses the reactive lettuce client.
Expected behavior
I'd expect less memory used by TracingSubscriber
Actual behavior
TracingSubscriber takes 81% of the total memory allocated for the Redis command
Javaagent or library instrumentation version
1.29.0
Environment
lettuce-core:6.1.9
Additional context
No response
Could you provide a sample application along with any instructions needed to reproduce the issue?
Hello, as soon as I find the time I'll try to provide an example. Just as additional info, I tried limiting the CommandHandler request queue size, and even though the total CommandHandler size decreased, there is still huge memory consumption (causing OOM), apparently still from OpenTelemetry in other objects:
For now I just deactivated it.
Hi @SimoneGiusso,
As @laurit mentioned, a reproducer app would be helpful. Alternatively, you could provide the heap dump file so we can do our own analysis.
It is hard to say without a heap dump file, but I think most of the memory used by the TracingSubscriber instances is occupied by io.opentelemetry.context.Context instances (e.g. io.opentelemetry.context.ArrayBasedContext).