
Some exceptions thrown on main thread are swallowed by AppKit

Open depth42 opened this issue 4 years ago • 26 comments

With appcenter v2.5.3, getting information for exceptions on the main thread has been greatly improved. Thanks for this! But we are still seeing some exception crashes for which we do not get exception backtraces, like this one:

AppKit              -[NSApplication _crashOnException:]
AppKit              +[CATransaction(NSCATransaction) NS_setFlushesWithDisplayLink]
AppKit              ___NSRunLoopObserverCreateWithHandler_block_invoke
CoreFoundation      __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__
CoreFoundation      __CFRunLoopDoObservers
CoreFoundation      __CFRunLoopRun
CoreFoundation      CFRunLoopRunSpecific
HIToolbox           RunCurrentEventLoopInMode
HIToolbox           ReceiveNextEventCommon
HIToolbox           _BlockUntilNextEventMatchingListInModeWithFilter
AppKit              _DPSNextEvent
AppKit              -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:]
AppKit              -[NSApplication run]
AppKit              NSApplicationMain
Merlin Project 6    main main.m:17
libdyld.dylib       start

Is there any chance to also catch these?

depth42 avatar Jan 21 '20 13:01 depth42

Hi there! Currently, we report all available information. Could you please confirm what information exactly is lost and how to reproduce this kind of behavior? There's not much we can do having so little info.

annakocheshkova avatar Jan 22 '20 08:01 annakocheshkova

Sadly, we only get these reports from our customers and are currently unable to reproduce the crashes. But I think I understand why appcenter currently cannot gather information when exceptions occur during +[CATransaction NS_setFlushesWithDisplayLink]. Exceptions that occur there are caught by Apple and forwarded to the private method -[NSApplication _crashOnException:] instead of -[NSApplication reportException:]. Appcenter only seems to swizzle the latter method and therefore misses the former.

depth42 avatar Jan 27 '20 09:01 depth42

I looked a bit deeper into this. It seems that appcenter is somehow dropping the "Application Specific Backtrace" information from the raw crash logs. This information is added inside -[NSApplication _crashOnException:] by calling _NSNoteInCrashReports. There is a similar issue being reported for iOS: https://github.com/microsoft/appcenter/issues/857

You can reproduce this simply by throwing an exception from within the -drawRect: method of an NSView subclass. The system-created crash log contains the application-specific backtrace, but the raw crash log displayed by appcenter doesn't.
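A minimal sketch of that reproduction, assuming a hypothetical view class named CrashingView that is placed in any window of a test app. The class name and the exception message are illustrative only and do not come from the original report.

```objc
#import <Cocoa/Cocoa.h>

// Hypothetical view used only to reproduce the crash path described above.
@interface CrashingView : NSView
@end

@implementation CrashingView

- (void)drawRect:(NSRect)dirtyRect {
  [super drawRect:dirtyRect];
  // Raised during the display pass. According to this thread, AppKit catches
  // it higher up the stack and forwards it to
  // -[NSApplication _crashOnException:] rather than -reportException:.
  [NSException raise:NSInternalInconsistencyException
              format:@"Test exception thrown from -drawRect:"];
}

@end
```

Running the app and letting the view draw once should be enough to compare the system crash log with the raw log shown in the portal.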

depth42 avatar Jan 27 '20 10:01 depth42

Hi @depth42, yes, "Application Specific Information" is not collected at the moment. But that is a feature request, not a bug (I've fixed the labels in the linked issue). It's also a different thing: in that case the error message is produced by low-level system frameworks rather than by a thrown exception. I made a draft PR into the underlying library for collecting this info, but it's not ready (and has no ETA) at the moment.

In your case it is an exception. Thanks for the repro steps! It needs to be investigated, but we can probably try to override nextEventMatchingEventMask:untilDate:inMode:dequeue: (or something similar) and wrap the super call in @try/@catch (you can try to do it manually now, similar to sendEvent: as documented here).
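For reference, the sendEvent: pattern mentioned here can be sketched roughly as follows. The subclass name ExceptionHandlerApplication is an assumption, and MSCrashes is the class name used elsewhere in this thread (newer SDK versions use an MSAC prefix). This only covers exceptions raised during event dispatch; per the discussion in this thread, it would not be expected to catch exceptions thrown from -drawRect:.

```objc
#import <Cocoa/Cocoa.h>
@import AppCenterCrashes;

// Assumed subclass name; it must be registered as the app's principal class.
@interface ExceptionHandlerApplication : NSApplication
@end

@implementation ExceptionHandlerApplication

// Public AppKit hook: hand the exception to App Center, then let AppKit proceed.
- (void)reportException:(NSException *)exception {
  [MSCrashes applicationDidReportException:exception];
  [super reportException:exception];
}

// Wrap event dispatch so exceptions raised while handling an event
// are routed through -reportException: above.
- (void)sendEvent:(NSEvent *)event {
  @try {
    [super sendEvent:event];
  } @catch (NSException *exception) {
    [self reportException:exception];
  }
}

@end
```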

BTW, I cannot find any info about NSNoteInCrashReports. Where did you manage to find it?

MatkovIvan avatar Jan 27 '20 12:01 MatkovIvan

@MatkovIvan _NSNoteInCrashReports seems to be a private function which is called from within -[NSApplication _crashOnException:]

I don't think there is a good public API point for catching exceptions thrown in -drawRect:. The exceptions are caught further up in the call stack by +[CATransaction NS_setFlushesWithDisplayLink] (or +[CATransaction NS_setFlushesWithDisplayRefresh] before macOS 10.15) and are sent directly to -[NSApplication _crashOnException:].

I think it is a bug on Apple's side that they directly call -[NSApplication _crashOnException:] without going through the public -[NSApplication reportException:] first.

It looks like swizzling this private method is the only way to go right now.
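A rough sketch of what such swizzling could look like, assuming the private selector still exists on the running macOS version. The function names InstallCrashOnExceptionHook and ReportingCrashOnException are hypothetical, and the report-then-abort body mirrors the experimental workaround posted in this thread. Because it touches private API, treat it as an experiment rather than something to ship.

```objc
#import <Cocoa/Cocoa.h>
#import <objc/runtime.h>
#include <stdlib.h>
@import AppCenterCrashes;

// Replacement implementation (hypothetical name). It reports the exception
// and terminates without calling the original AppKit implementation.
static void ReportingCrashOnException(id receiver, SEL selector, NSException *exception) {
  [MSCrashes applicationDidReportException:exception];
  abort();
}

// Hypothetical installer; call it once, early in launch.
void InstallCrashOnExceptionHook(void) {
  // Built from a string because the selector is private and undeclared.
  SEL selector = NSSelectorFromString(@"_crashOnException:");
  Method method = class_getInstanceMethod([NSApplication class], selector);
  if (method == NULL) {
    return; // The private method is absent on this OS version; do nothing.
  }
  method_setImplementation(method, (IMP)ReportingCrashOnException);
}
```

Replacing the implementation on NSApplication itself means no subclass or principal-class setting is involved, at the cost of depending on a selector Apple may rename or remove.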

depth42 avatar Jan 27 '20 13:01 depth42

I've experimented a bit with this case and managed to send it to App Center. But for now, I can suggest only this dirty hack:

Define _crashOnException: in your NSApplication class

- (void)_crashOnException:(NSException *)exception {
  [MSCrashes applicationDidReportException:exception];
  abort();
}

abort is used here to avoid infinite recursion. It doesn't report the exception properly to the system reporter, but it sends the exception details to App Center just fine. We'll see what we can do in the SDK to catch this. Leaving this issue open as a feature request.

MatkovIvan avatar Jan 28 '20 13:01 MatkovIvan

@MatkovIvan how do you replace the NSApplication._crashOnException implementation with yours? Can you please provide a more complete example? Sorry, I'm coming from the Swift world and have only basic Objective-C knowledge. :) Thanks!

fbarbat avatar Jul 28 '20 23:07 fbarbat

Create an NSApplication subclass and assign it to the Principal Class (NSPrincipalClass) key in Info.plist. But it is a private method and this workaround was made only as an experiment.
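Put together, the subclass approach might look like the sketch below. The class name CrashReportingApplication is hypothetical, and the method body is the experimental snippet from earlier in this thread; since it overrides a private AppKit method, it is not something to rely on in production.

```objc
#import <Cocoa/Cocoa.h>
#include <stdlib.h>
@import AppCenterCrashes;

// Hypothetical subclass. In Info.plist, set the NSPrincipalClass key
// ("Principal class") to the string "CrashReportingApplication".
@interface CrashReportingApplication : NSApplication
@end

@implementation CrashReportingApplication

// Overrides a PRIVATE AppKit method, so it may stop working on any OS update.
- (void)_crashOnException:(NSException *)exception {
  [MSCrashes applicationDidReportException:exception];
  // Terminate here instead of continuing, as in the original workaround.
  abort();
}

@end
```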

AnastasiaKubova avatar Jul 29 '20 07:07 AnastasiaKubova

Thanks for your suggestion @AnastasiaKubova! However, I tried that and it isn't working for me. My understanding was that, in some specific cases, the run loop calls NSApplication directly when dispatching to CATransaction, ignoring what's declared as "Principal class". Is that correct? My question was about how to replace the NSApplication method in place without subclassing it. @depth42 mentioned swizzling it, for example.

ghost avatar Jul 29 '20 14:07 ghost

I re-checked and it works for me as expected. Please note that this workaround was suggested as just an experiment and it should not be used in production because it uses a private API.

AnastasiaKubova avatar Jul 30 '20 08:07 AnastasiaKubova

Any ETA on adding the app info to AppCenter?

I've received 67 reports in the past 3 days which seem to match this ticket's scenario. Here is a crash report section to illustrate:

Exception Type:  SIGILL
Exception Codes: ILL_NOOP at 0x0
Crashed Thread:  0

Thread 0 Crashed:
0   AppKit                               0x00007fff37022b43 -[NSApplication _crashOnException:] + 106
1   AppKit                               0x00007fff36de07ac __62+[CATransaction(NSCATransaction) NS_setFlushesWithDisplayLink]_block_invoke + 804
2   AppKit                               0x00007fff374ff850 ___NSRunLoopObserverCreateWithHandler_block_invoke + 40
3   CoreFoundation                       0x00007fff399cb3c5 __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__ + 22
4   CoreFoundation                       0x00007fff399cb2f7 __CFRunLoopDoObservers + 456
5   CoreFoundation                       0x00007fff399ca895 __CFRunLoopRun + 873
6   CoreFoundation                       0x00007fff399c9ece CFRunLoopRunSpecific + 461
7   HIToolbox                            0x00007fff385f8abd RunCurrentEventLoopInMode + 291
8   HIToolbox                            0x00007fff385f87d5 ReceiveNextEventCommon + 583
9   HIToolbox                            0x00007fff385f8579 _BlockUntilNextEventMatchingListInModeWithFilter + 63
10  AppKit                               0x00007fff36c40829 _DPSNextEvent + 882
11  AppKit                               0x00007fff36c3f070 -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 1351
12  AppKit                               0x00007fff36c30d7e -[NSApplication run] + 657
13  AltTab                               0x000000010a26c70e main (main.swift:3)
14  libdyld.dylib                        0x00007fff7382dcc9 start + 0

lwouis avatar Aug 20 '20 08:08 lwouis

@lwouis unfortunately, we don't have an ETA for it yet.

russelarms avatar Aug 20 '20 14:08 russelarms

I'm seeing the same thing: a lot of crashes with the exact same stack trace, but nothing that tells me anything about the actual crash :(

guidedways avatar Sep 19 '20 22:09 guidedways

Hey, there's no ETA we can provide yet. The only thing I can suggest: can you check whether @MatkovIvan's workaround above helps?

Jamminroot avatar Sep 21 '20 12:09 Jamminroot

The initial issue here is that some exceptions are swallowed by AppKit and cannot be caught without overriding private methods.

Marking it as a known issue instead of a feature request, because there is no known way to deal with it without accessing private API, and we're not going to add private API usage to a release, to avoid problems when submitting to the store.

MatkovIvan avatar Sep 21 '20 12:09 MatkovIvan

As a side note, Crashlytics manages this well. Their crash logs capture all such "swallowed" exceptions. I've now had to switch to Crashlytics and am using AppCenter for its analytics support, which Crashlytics doesn't offer on the Mac. Would love to switch back once this can be fixed.

guidedways avatar Sep 21 '20 15:09 guidedways

Hm, interesting 🤔

@guidedways could you please share some reproduction sample of this crash type? Maybe it's a different case with similar symptoms or so.

MatkovIvan avatar Sep 21 '20 19:09 MatkovIvan

@MatkovIvan So here's an example you may try and use.

Basically, one of our apps was somehow crashing occasionally in this code:

      NSScrollView *m_scroller;
      CGFloat someOffset;
      ...
      NSClipView *clipView = [m_scroller contentView];
      NSView *documentView = [m_scroller documentView];
      NSRect visRect = [clipView documentVisibleRect];
      [documentView scrollPoint:NSMakePoint(0.0, someOffset)];

Occasionally, and we don't know how, someOffset would be set to infinity (some value > DBL_MAX) and the app would crash. App Center wasn't catching these exceptions, possibly because this was happening in the UI run loop. Here's what AppCenter reported:

AppKit              -[NSApplication _crashOnException:]
AppKit              +[CATransaction(NSCATransaction) NS_setFlushesWithDisplayLink]
AppKit              ___NSRunLoopObserverCreateWithHandler_block_invoke
CoreFoundation      __CFRUNLOOP_IS_CALLING_OUT_TO_AN_OBSERVER_CALLBACK_FUNCTION__
CoreFoundation      __CFRunLoopDoObservers
CoreFoundation      __CFRunLoopRun

But finally after switching to Crashlytics, this is what we found:

Fatal Exception: NSInternalInconsistencyException
Invalid parameter not satisfying: isfinite(newSize.height)

0  CoreFoundation                 0x7fff383d6b47 (Missing)
1  libobjc.A.dylib                0x7fff710865bf (Missing)
2  CoreFoundation                 0x7fff383ffd08 (Missing)
3  Foundation                     0x7fff3aaf1e9d (Missing)
4  AppKit                         0x7fff35603b8e (Missing)
5  OurApp                        0x102e0f025 -[MonthGridCell setScrollerValue] + 1611 (MonthGridCell.m:1611)
6  OurApp                        0x102e0f8c7 -[MonthGridCell showScroller] + 1728 (MonthGridCell.m:1728)
7  OurApp                        0x102dba588 -[GridView mouseEntered:] + 3336 (GridView.m:3336)

So we were finally able to avoid this crash by checking for if (value > DBL_MAX) { value = 0; }
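As a sketch of that kind of guard, the helper below uses isfinite() from the C standard library instead of a comparison against DBL_MAX, since the assertion in the Crashlytics report is itself phrased as isfinite(newSize.height). The function name sanitized_scroll_offset is hypothetical; it is plain C, so it compiles unchanged inside an Objective-C file.

```c
#include <math.h>

/* Hypothetical helper: returns an offset that is safe to pass on to
   -scrollPoint:. isfinite() is false for +infinity, -infinity and NaN,
   whereas a bare `offset > DBL_MAX` test catches only +infinity. */
double sanitized_scroll_offset(double offset) {
  if (!isfinite(offset)) {
    return 0.0; /* Same fallback value as the fix described above. */
  }
  return offset;
}
```

The call site would then pass sanitized_scroll_offset(someOffset) as the y coordinate instead of the raw value.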

Interestingly, the stack trace from Crashlytics also has the following, suggesting that the actual crash was elsewhere; so the "work around" posted above is possibly what Google is doing, except that it doesn't require effort on our part. It just works; you can tell by the FIRCLSNSApplicationReportException function in the stack trace. They seem to be using a barrier sync DispatchQueue which means they don't have to separately call abort() as it will guarantee it's run once:

Crashed: com.google.firebase.crashlytics.mac.exception
0  OurApp                        0x1029e2096 FIRCLSProcessRecordAllThreads + 392 (FIRCLSProcess.c:392)
1  OurApp                        0x1029e2487 FIRCLSProcessRecordAllThreads + 423 (FIRCLSProcess.c:423)
2  OurApp                        0x1029d8d44 FIRCLSHandler + 34 (FIRCLSHandler.m:34)
3  OurApp                        0x1029d4d30 __FIRCLSExceptionRecord_block_invoke + 218 (FIRCLSException.mm:218)
4  libdispatch.dylib              0x7fff721d5658 _dispatch_client_callout + 8
5  libdispatch.dylib              0x7fff721e16ec _dispatch_lane_barrier_sync_invoke_and_complete + 60
6  OurApp                        0x1029d43ca FIRCLSExceptionRecord + 225 (FIRCLSException.mm:225)
7  OurApp                        0x1029d472c FIRCLSExceptionRecordNSException + 111 (FIRCLSException.mm:111)
8  OurApp                        0x1029d40fa FIRCLSNSApplicationReportException(objc_object*, objc_selector*, NSException*) + 400 (FIRCLSException.mm:400)
9  AppKit                         0x7fff355bb640 -[NSApplication run] + 836
10 AppKit                         0x7fff3558d396 NSApplicationMain + 777
11 libdyld.dylib                  0x7fff7222ecc9 start + 1

guidedways avatar Sep 21 '20 22:09 guidedways

@guidedways I looked at this case; let me explain what's going on here. The exception message that you see in the Crashlytics report comes from the "Application Specific Information" data that I mentioned above. It's not related to replacing the _crashOnException: method at all.

so the "work around" posted above is possibly what Google is doing

No, they just use [NSApplication reportException:]. The App Center SDK catches/overrides this automatically too (moreover, unlike Crashlytics, we handle more than just this).

They seem to be using a barrier sync DispatchQueue which means they don't have to separately call abort() as it will guarantee it's run once

No, it is there for another purpose.

So, your request is more about https://github.com/microsoft/appcenter/issues/857. This will help in cases like yours.

MatkovIvan avatar Sep 30 '20 13:09 MatkovIvan

@MatkovIvan thank you! It would then be great if #857 could be implemented, as it would help for sure.

guidedways avatar Sep 30 '20 13:09 guidedways

I've experimented a bit with this case and managed to send it to App Center. But for now, I can suggest only this dirty hack:

Define _crashOnException: in your NSApplication class

- (void)_crashOnException:(NSException *)exception {
  [MSCrashes applicationDidReportException:exception];
  abort();
}

abort is used here to avoid infinite recursion. It doesn't report the exception properly to the system reporter, but it sends the exception details to App Center just fine. We'll see what we can do in the SDK to catch this. Leaving this issue open as a feature request.

@MatkovIvan I am about to implement this workaround. You mentioned that you call abort() to avoid an infinite recursion; would calling [super _crashOnException:exception] cause such a recursion? My understanding is that that should simply call the NSApplication implementation, which should then crash "properly", forwarding the exception to the OS. Please let me know if I am missing something.

MrMage avatar Oct 26 '21 14:10 MrMage

Hi, @MrMage! Thanks for getting in touch with us! Unfortunately, I don't have full context about this workaround and I'm not sure that I can answer your question. Also, please note that this code snippet was suggested as an experiment and is not recommended for use because it relies on private API.

AnastasiaKubova avatar Oct 27 '21 15:10 AnastasiaKubova

Hi! I see you found a workaround for macOS, but what about iOS? We also get these meaningless stack traces:

Last Exception Backtrace:
0   CoreFoundation                       0x00000002046efef8 __exceptionPreprocess + 228
1   libobjc.A.dylib                      0x00000002038bda40 objc_exception_throw + 52
2   UIKitCore                            0x0000000230f43514 -[UIViewController _presentViewController:withAnimationController:completion:] + 4808
3   UIKitCore                            0x0000000230f45bac __63-[UIViewController _presentViewController:animated:completion:]_block_invoke + 100
4   UIKitCore                            0x0000000230f5f398 -[_UIViewControllerTransitionCoordinator _applyBlocks:releaseBlocks:] + 268
5   UIKitCore                            0x0000000230f5b53c -[_UIViewControllerTransitionContext _runAlongsideCompletions] + 136
6   UIKitCore                            0x0000000230f5b214 -[_UIViewControllerTransitionContext completeTransition:] + 128
7   UIKitCore                            0x000000023198b414 -[UIViewAnimationBlockDelegate _didEndBlockAnimation:finished:context:] + 740
8   UIKitCore                            0x0000000231961490 -[UIViewAnimationState sendDelegateAnimationDidStop:finished:] + 308
9   UIKitCore                            0x0000000231961a7c -[UIViewAnimationState animationDidStop:finished:] + 292
10  UIKitCore                            0x0000000231961b1c -[UIViewAnimationState animationDidStop:finished:] + 452
11  QuartzCore                           0x0000000208ce3394 CA::Layer::run_animation_callbacks(void*) + 280
12  libdispatch.dylib                    0x0000000204128484 _dispatch_client_callout + 12
13  libdispatch.dylib                    0x00000002040d49ec _dispatch_main_queue_callback_4CF$VARIANT$mp + 1064
14  CoreFoundation                       0x000000020467e1bc __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__ + 8
15  CoreFoundation                       0x0000000204679084 __CFRunLoopRun + 1960
16  CoreFoundation                       0x00000002046785b8 CFRunLoopRunSpecific + 432
17  GraphicsServices                     0x00000002068ec584 GSEventRunModal + 96
18  UIKitCore                            0x00000002314f4bc8 UIApplicationMain + 208
19  myPORT                               0x0000000100a3f1ac main (main.m:22)
20  libdyld.dylib                        0x0000000204138b94 start + 0

Thread 0 Crashed:
0   libsystem_kernel.dylib               0x0000000204285104 __pthread_kill + 8
1   libsystem_c.dylib                    0x00000002041dcd78 abort + 136
2   myPORT                               0x0000000100e31804 uncaught_exception_handler.cold.1 + 24
3   myPORT                               0x0000000100db5c78 uncaught_exception_handler (PLCrashReporter.m:366)
4   CoreFoundation                       0x00000002046f0234 __handleUncaughtException + 688
5   libobjc.A.dylib                      0x00000002038bde3c _objc_terminate() + 108
6   myPORT                               0x0000000100da9af0 MSACCrashesUncaughtCXXTerminateHandler() (MSACCrashesCXXExceptionHandler.mm:161)
7   libc++abi.dylib                      0x00000002038b10fc std::__terminate(void (*)()) + 12
8   libc++abi.dylib                      0x00000002038b1188 std::terminate() + 80
9   libdispatch.dylib                    0x0000000204128498 _dispatch_client_callout + 32
10  libdispatch.dylib                    0x00000002040d49ec _dispatch_main_queue_callback_4CF$VARIANT$mp + 1064
11  CoreFoundation                       0x000000020467e1bc __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__ + 8
12  CoreFoundation                       0x0000000204679084 __CFRunLoopRun + 1960
13  CoreFoundation                       0x00000002046785b8 CFRunLoopRunSpecific + 432
14  GraphicsServices                     0x00000002068ec584 GSEventRunModal + 96
15  UIKitCore                            0x00000002314f4bc8 UIApplicationMain + 208
16  myPORT                               0x0000000100a3f1ac main (main.m:22)
17  libdyld.dylib                        0x0000000204138b94 start + 0

vlshcherbakov avatar Dec 06 '21 19:12 vlshcherbakov

Hi @vlshcherbakov, the workaround mentioned above is valid for macOS only. It doesn't look like there is anything like [NSApplication _crashOnException:] on iOS. Your stack trace looks different from the original issue and it contains some useful information. You can try adding some telemetry at the navigation points in your app: log the state and some details, and send them to App Center using Analytics.TrackEvent. These events will be linked to the crash by a session id.
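On the Apple SDK, that suggestion could be sketched as below. It assumes the 4.x class name MSACAnalytics (matching the MSAC prefix visible in the stack trace above); the helper name, event name, and property keys are all hypothetical.

```objc
@import Foundation;
@import AppCenterAnalytics;

// Hypothetical helper: call it right before presenting a view controller so
// the event lands in the same session as a later crash.
static void TrackScreenPresentation(NSString *screenName, NSString *presenterName) {
  // Property keys and values must be strings.
  [MSACAnalytics trackEvent:@"screen_presented"
             withProperties:@{
               @"screen" : screenName,
               @"presenter" : presenterName
             }];
}
```

A call such as TrackScreenPresentation(@"Settings", @"HomeViewController") just before -presentViewController:animated:completion: would record which presentation preceded the exception.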

There is also a chance that "Application specific information" (https://github.com/microsoft/appcenter/issues/857) will help in your case, but I can't confirm it.

DmitriyKirakosyan avatar Dec 07 '21 22:12 DmitriyKirakosyan

we are still seeing some exception crashes for which we do not get exception backtraces, like this one:

Crashed: com.apple.main-thread
0  AppKit                         0x31a728 -[NSApplication _crashOnException:] + 240
1  AppKit                         0x31a500 -[NSApplication reportException:] + 440
2  AppKit                         0x2d02c -[NSApplication run] + 640
3  AppKit                         0x43cc NSApplicationMain + 880
4  AppKit                         0x25cb78 _NSApplicationMainWithInfoDictionary + 22
5  UIKitMacHelper                 0x4960 UINSApplicationMain + 988
6  UIKitCore                      0x38e8 UIApplicationMain + 148
7  appname                        0x6fd8 main + 32 (main.m:32)
8  ???                            0x1985a3f28 (Missing)

anilMakhijaEnphase avatar Nov 21 '23 06:11 anilMakhijaEnphase

Some exception crashes for which we do not get exception backtraces, like this one:

Fatal Exception: NSInternalInconsistencyException
0  CoreFoundation                 0xf2564 (Missing UUID 47e4ec098f6e30a899d034024d4f8122)
1  libobjc.A.dylib                0x19eb4 (Missing UUID 9bab95567a2a30a8acde010ba8e2367d)
2  Foundation                     0x82940 (Missing UUID 9558b1ebdda33fda88a5e785ecdfcd30)
3  Foundation                     0xe0ee0 (Missing UUID 9558b1ebdda33fda88a5e785ecdfcd30)
4  Foundation                     0x75af30 (Missing UUID 9558b1ebdda33fda88a5e785ecdfcd30)
5  AppKit                         0x7cbf8 (Missing UUID f3527312e4263f7cb77b2bf49d1b7c04)
6  AppKit                         0xad154c (Missing UUID f3527312e4263f7cb77b2bf49d1b7c04)
7  AppKit                         0x31e5b4 (Missing UUID f3527312e4263f7cb77b2bf49d1b7c04)
8  AppKit                         0x33e1c0 (Missing UUID f3527312e4263f7cb77b2bf49d1b7c04)
9  AppKit                         0xde1a74 (Missing UUID f3527312e4263f7cb77b2bf49d1b7c04)
10 AppKit                         0x347e4 (Missing UUID f3527312e4263f7cb77b2bf49d1b7c04)
11 AppKit                         0xde159c (Missing UUID f3527312e4263f7cb77b2bf49d1b7c04)
12 AppKit                         0xde0f54 (Missing UUID f3527312e4263f7cb77b2bf49d1b7c04)
13 AppKit                         0xde112c (Missing UUID f3527312e4263f7cb77b2bf49d1b7c04)
14 AppKit                         0x7a695c (Missing UUID f3527312e4263f7cb77b2bf49d1b7c04)
15 AppKit                         0x7a6ce4 (Missing UUID f3527312e4263f7cb77b2bf49d1b7c04)
16 SkyLight                       0xb134 (Missing UUID 67c718b452dc3b258e7a12ea2b414369)
17 SkyLight                       0x3a0230 (Missing UUID 67c718b452dc3b258e7a12ea2b414369)
18 SkyLight                       0x3a012c (Missing UUID 67c718b452dc3b258e7a12ea2b414369)
19 libdispatch.dylib              0x1cb8 (Missing UUID a53d555df748301083fe385c660a81bd)
20 libdispatch.dylib              0x3910 (Missing UUID a53d555df748301083fe385c660a81bd)
21 libdispatch.dylib              0x11fa8 (Missing UUID a53d555df748301083fe385c660a81bd)
22 libdispatch.dylib              0x11bc0 (Missing UUID a53d555df748301083fe385c660a81bd)
23 CoreFoundation                 0xbeecc (Missing UUID 47e4ec098f6e30a899d034024d4f8122)
24 CoreFoundation                 0x7c7d0 (Missing UUID 47e4ec098f6e30a899d034024d4f8122)
25 CoreFoundation                 0x7b9ac (Missing UUID 47e4ec098f6e30a899d034024d4f8122)
26 HIToolbox                      0x30448 (Missing UUID e2187dfe1fb43b47874ea0a6291e51b2)
27 HIToolbox                      0x30284 (Missing UUID e2187dfe1fb43b47874ea0a6291e51b2)
28 HIToolbox                      0x2ffdc (Missing UUID e2187dfe1fb43b47874ea0a6291e51b2)
29 AppKit                         0x398a4 (Missing UUID f3527312e4263f7cb77b2bf49d1b7c04)
30 AppKit                         0x813980 (Missing UUID f3527312e4263f7cb77b2bf49d1b7c04)
31 AppKit                         0x2cd50 (Missing UUID f3527312e4263f7cb77b2bf49d1b7c04)
32 AppKit                         0x4014 (Missing UUID f3527312e4263f7cb77b2bf49d1b7c04)
33 AppKit                         0x2573a4 (Missing UUID f3527312e4263f7cb77b2bf49d1b7c04)
34 UIKitMacHelper                 0x46a8 (Missing UUID f03f3f95a13739e2b894c76dc6e8a1e5)
35 UIKitCore                      0x3cfc (Missing UUID 2aebcc42a1aa3535be531da03d4d2ce3)
36 Enlighten                      0x70e8 main + 32 (main.m:32)
37 ???                            0x183ef50e0 (Missing)

anilMakhijaEnphase avatar Feb 15 '24 06:02 anilMakhijaEnphase

As we do not have plans to fix this bug in the next year, I'm closing the issue.

DmitriyKirakosyan avatar Apr 15 '24 05:04 DmitriyKirakosyan