sentry-cocoa icon indicating copy to clipboard operation
sentry-cocoa copied to clipboard

Stack tracer is crashing during captureException

Open kode54 opened this issue 1 year ago • 17 comments

Platform

macOS

Environment

Production

Installed

Swift Package Manager

Version

8.46.0

Xcode Version

16.2

Did it work on previous versions?

No idea, this has happened a few times

Steps to Reproduce

  1. App throws NSException during a work queue for tag reading
  2. Exception is pushed to [SentrySDK captureException:]
  3. App crashes

Expected Result

Exception should be logged, not sure if it should be fatal.

Actual Result

App crashes on this thread dump:

Image
NSInvalidArgumentException: *** -[__NSArrayM insertObject:atIndex:]: object cannot be nil

Are you willing to submit a PR?

I am willing to submit a PR, if I can figure out where it went wrong.

kode54 avatar Mar 10 '25 21:03 kode54

Hi @kode54, thanks for reporting this.

Are you able to share a link to the Sentry issue so we can take a closer look at the crash data?

philprime avatar Mar 11 '25 08:03 philprime

I'll have to give you an account on my self-hosted instance to access it.

kode54 avatar Mar 11 '25 09:03 kode54

Hmm, come to think of it, it may just be showing the stack trace from where I captured the exception, but the exception is correct for what was thrown by the @ try @ catch block. In this case, it was a NSString stringWithUTF8String being passed an invalid UTF-8 constant, and my fallback code not trying other string encodings, then some subsequent code attempting to assign the nil NSString* to an NSArray.

kode54 avatar Mar 11 '25 09:03 kode54

I actually just noticed the same behavior in a completely unrelated issue where the error was caught and the reported using Sentry.captureException at a different position in the code.

cc @philipphofmann would you say it is expected for the stack trace to point to the position where it was caught, rather than the stack trace of the original error that was caught?

internal case

philprime avatar Mar 11 '25 12:03 philprime

cc @philipphofmann would you say it is expected for the stack trace to point to the position where it was caught, rather than the stack trace of the original error that was caught?

Yes it is, but it often confuses users.

We have already removed those frames in the past, but then we sometimes remove too little or too much. It's complicated to remove the exact number of frames when you have no symbolication. Especially if the code changes. Then, we agreed on keeping the frames as it's also more accurate for letting the user know how we build the stacktrace. Some users actually complained that we removed the Sentry frames. We should document this somewhere so we don't have to keep writing this answer.

philipphofmann avatar Mar 11 '25 15:03 philipphofmann

When using captureException the internal methods are also added to the stack grace, seemingly looking like it crashed in the SDK. We will improve this in https://github.com/getsentry/sentry/issues/87076.

But in this case it looks like there might actually be an error occuring in buildStacktraceForCurrentThread so we'll investigate this.

philprime avatar Mar 19 '25 13:03 philprime

Can we look instead at using the members of the NSException to build the stack trace?

https://developer.apple.com/documentation/foundation/nsexception?language=objc#1651286

It has callStackReturnAddresses and callStackSymbols.

kode54 avatar Mar 30 '25 02:03 kode54

@kode54 the NSException crash monitor should already use the callStackReturnAddresses here:

https://github.com/getsentry/sentry-cocoa/blob/fbb76563c954bab9c13baa2b70a1f0962230c97f/Sources/SentryCrash/Recording/Monitors/SentryCrashMonitor_NSException.m#L69-L73

There's also #741 where @philipphofmann recently brought up that the callStackReturnAddresses could be used for it.

philprime avatar Mar 31 '25 08:03 philprime

What about if I'm catching the exception myself and passing it to captureException?

Edit: I'm getting a useless trace that starts from where captureException is called and ends at -[SentryStacktraceBuilder buildStacktraceForCurrentThread]

kode54 avatar Mar 31 '25 08:03 kode54

I digged into the code again and while the crash monitor uses the callStackReturnAddresses, the SentryStacktraceBuilder does not - sorry for the confusion on my side.

To collect some insights for us to understand and decide how we could continue here, here a summary of the status quo:

When using captureException it uses prepareEvent to build the crash event:

https://github.com/getsentry/sentry-cocoa/blob/fbb76563c954bab9c13baa2b70a1f0962230c97f/Sources/Sentry/SentryClient.m#L662-L665

The prepareEvent then uses the SentryThreadInspector to get the current threads with the stack trace of the current thread using SentryStacktraceBuilder:

https://github.com/getsentry/sentry-cocoa/blob/fbb76563c954bab9c13baa2b70a1f0962230c97f/Sources/Sentry/SentryThreadInspector.m#L90

https://github.com/getsentry/sentry-cocoa/blob/fbb76563c954bab9c13baa2b70a1f0962230c97f/Sources/Sentry/SentryThreadInspector.m#L112-L114

The stacktrace itself is built using sentrycrashsc_initSelfThread, which uses the C-level backtrace provided by the stdlib.

https://github.com/getsentry/sentry-cocoa/blob/fbb76563c954bab9c13baa2b70a1f0962230c97f/Sources/SentryCrash/Recording/Tools/SentryCrashStackCursor_SelfThread.m#L50-L79

@philipphofmann do you think we could change captureException to detect if the errors are NSException and use the callStackReturnAddresses like we do in SentryCrashMonitor_NSException.m?

In any case we need that changing the stacktrace could break issue grouping

philprime avatar Mar 31 '25 09:03 philprime

Issue grouping is an interesting subject, too. Unrelated to this topic, I still get issues reported from versions where I had hang detection enabled, popping up and causing "regressions" on issues I closed. None of them are connected to each other, because the hang traces happen in all sorts of locations. Most of the hanging I wasn't even aware of, because it's probably modal dialogs taking over the event loop. Another one I noticed in action was high resolution mouse wheel scrolling causing parts of the UI to pause updating while the playlist is scrolling in pixel increments. Alas. Subjects for other issues.

(I also just got a regression report from Firebase, for an issue long solved, because someone booted up a version that still had the issue and still used Firebase for reporting. I also finally stopped getting reports from Bugsnag for versions that preceded that switch to Firebase.)

kode54 avatar Mar 31 '25 09:03 kode54

@philipphofmann do you think we could change captureException to detect if the errors are NSException and use the callStackReturnAddresses like we do in SentryCrashMonitor_NSException.m?

Yes, we can do that, but like you said it can break grouping. The safest is to add a feature flag that is disabled by default to enable this behaviour. In the next major we can remove the feature flag and make it the new default behavior.

philipphofmann avatar Apr 07 '25 09:04 philipphofmann

@kode54, to your unrelated topic: This looks like a completely different problem. Please open an extra issue for this.

philipphofmann avatar Apr 07 '25 09:04 philipphofmann

@kode54, to your unrelated topic: This looks like a completely different problem. Please open an extra issue for this.

I already filed an issue on the App Hangs, https://github.com/getsentry/sentry-cocoa/issues/4947 closed as not planned.

kode54 avatar Apr 07 '25 21:04 kode54

@kode54, we closed the other is as not planned, because we will fix the underlying problem with https://github.com/getsentry/sentry-cocoa/issues/4950.

philipphofmann avatar Apr 08 '25 11:04 philipphofmann

Can I get this issue renamed or repurposed? It seems that the actual underlying issue is mistaken traces logged for the thread that reports the exception, causing the trace to end in the exception handler instead of the location where it was thrown.

kode54 avatar Jun 09 '25 23:06 kode54

@kode54, when the underlying issue is something different, I think we should close this issue and open a new one referencing this one.

philipphofmann avatar Jun 18 '25 08:06 philipphofmann

Closing this due to inactivity. If this is still relevant for you, feel free to comment and provide current information, or open an new issue

philprime avatar Nov 18 '25 13:11 philprime