sentry-java icon indicating copy to clipboard operation
sentry-java copied to clipboard

UncaughtExceptionHandlerIntegration leaks memory

Open xenomote opened this issue 1 year ago • 9 comments

Integration

sentry

Java Version

17

Version

7.6.0

Steps to Reproduce

Cannot be very specific because sentry SDK is being used by a plugin within a framework, and within that environment we cannot reproduce the issue unless we are receiving production traffic

Expected Result

no memory leak

Actual Result

the UncaughtExceptionHandlerIntegration appears to occupy the majority of the heap, eventually resulting in memory exhaustion and a crash.

On auditing the code it appears that https://github.com/getsentry/sentry-java/blob/28c9a83af55b119517dbd2fcf74975fb5822907b/sentry/src/main/java/io/sentry/UncaughtExceptionHandlerIntegration.java#L68-L81 causes a stack to form, which prevents any of the exeption handlers from being GC'd and results in OOM. No idea why that register method is being called over and over again

image

we are planning to simply turn the UncaughtExceptionHandler off in the options to resolve this issue, but it's weird that the Sentry SDK apparently behaves this way by default

xenomote avatar Mar 14 '24 19:03 xenomote

I might be reading it wrong, but it looks like defaultExceptionHandler is already of instance UncaughtExceptionHandlerIntegration, which means someone set it before us manually, or the sdk has been initialized multiple times without calling Sentry.close() in between.

romtsn avatar Mar 14 '24 20:03 romtsn

Yes I agree with you Roman, that does seem to be the case

Sadly I can't really give any detail on how this would be the case because it's a library used by a plugin used by a framework, all of which I have no direct control over aside from config

From what I understand from debugging the whole stack, init should only be called once at framework startup when the plugin is initialised, and captureEvent is called for each event to be logged

I have no idea how or why the register method is being called multiple thousands of times, but my assumption would have been that init would be idempotent

xenomote avatar Mar 15 '24 09:03 xenomote

hm, but do you see that register is called multiple thousands of times actually?

My explanation would be that we just get into a infinite recursive call, because we delegate to defaultExceptionHandler here https://github.com/getsentry/sentry-java/blob/28c9a83af55b119517dbd2fcf74975fb5822907b/sentry/src/main/java/io/sentry/UncaughtExceptionHandlerIntegration.java#L134

and since it's our own handler, we'll get into the loop.

One improvement from our side would be to just check if defaultExceptionHandler is not instance of UncaughtExceptionHandlerIntegration before delegating to it to avoid the loop, that should not hurt.

romtsn avatar Mar 15 '24 10:03 romtsn

it's a library used by a plugin used by a framework,

also, just realised - we don't really support this case out of the box (using our sdk in a library), because there might be another library/plugin which uses sentry and then they will have a conflict (including sending events to a wrong dsn/project). I assume this is precisely what happens here actually.

romtsn avatar Mar 15 '24 10:03 romtsn

hm, but do you see that register is called multiple thousands of times actually?

my evidence for this is that defaultExceptionHandler is only ever assigned once- in the register method. To be clear, I don't think it is the register method being called on the same instance of UncaughtExceptionHandlerIntegration, because the register boolean gate will prevent it. It seems more like new instances of the handler are being created and registered repeatedly, thus forming a stack

because there might be another library/plugin which uses sentry

I can categorically guarantee that is not the case for my particular situation, we only have this one plugin which uses the sentry SDK, and no other component in the entire system makes use or reference of the Sentry SDK in any way

As far as I can understand it, the framework we use only initialises one instance of the plugin, and that one plugin instance only calls Sentry.init() once at startup, so I don't see how register is being called repeatedly, unless it is somehow triggered by a background thread within the Sentry SDK, or hub.captureEvent somehow causes it per-call

As a side note- hypothetically speaking, if calling init more than once is unsupported and results in this kind of unpleasant side effect, I would expect some critical error log or runtime exception to explicitly crash the application so that it's extremely obvious when this kind of misconfiguration happens so this kind of problem will reveal itself in development, not production

If crashing doesn't seem like a palatable choice then I would at least expect init to be idempotent with a prominent warning about init being called more than once. As it stands, I don't even know if this is the problem, but if it is then there was no way I could have known about it until we had it in prod and it killed a mission critical system with OOM :cry:

xenomote avatar Mar 15 '24 10:03 xenomote

yeah the init call is supposed to be idempotent, and we do print a warning here https://github.com/getsentry/sentry-java/blob/28c9a83af55b119517dbd2fcf74975fb5822907b/sentry/src/main/java/io/sentry/Sentry.java#L222-L228

without testing it locally however, I cannot say if we've overlooked something to clean up between multiple init calls, was just posting my assumptions here. Maybe @adinauer can take a closer look, I'm out of bandwidth for this unfortunately

romtsn avatar Mar 15 '24 11:03 romtsn

no worries. Thank you for your attention Roman

I apologise for the somewhat vague issue report, but I'm only able to tell you as much as I know myself :sweat_smile:

the init call is supposed to be idempotent, and we do print a warning here

Not sure where that is expected to be logged (in sentry itself or in stderr/stdout), but I don't see any evidence of it anywhere. My assumption is that init is only ever called once, but as I say I can't exactly confirm it. The fact that I don't see the warning printed anywhere would support that assumption

xenomote avatar Mar 15 '24 11:03 xenomote

Hi @xenomote, Here's a short update on this issue: I was able to reproduce locally now. We are currently working on a fix for this issue.

Thank you for your patience, we'll keep you updated.

lbloder avatar Apr 02 '24 12:04 lbloder

excellent, thanks for your attention

I am not personally blocked on this issue, because I wasn't using this functionality anyway- I just disabled in the init config. But for the sake of others it would still be positive to fix :+1:

xenomote avatar Apr 02 '24 14:04 xenomote

:tada:

xenomote avatar Jun 25 '24 14:06 xenomote