sentry-cocoa icon indicating copy to clipboard operation
sentry-cocoa copied to clipboard

feat: Added support for arm64e

Open brustolin opened this issue 1 year ago • 25 comments

:scroll: Description

A developer may choose the arm64e architecture for their iOS app to benefit from enhanced security features. Sentry was not compiling for this architecture due to issues with accessing machine context properties

By using arm_thread_state64_get_* macros, we can read the necessary information.

:bulb: Motivation and Context

closes #3344

:green_heart: How did you test it?

Samples

:pencil: Checklist

You have to check all boxes before merging:

  • [ ] I reviewed the submitted code.
  • [ ] I added tests to verify the changes.
  • [ ] No new PII added or SDK only sends newly added PII if sendDefaultPII is enabled.
  • [ ] I updated the docs if needed.
  • [ ] Review from the native team if needed.
  • [ ] No breaking change or entry added to the changelog.
  • [ ] No breaking change for hybrid SDKs or communicated to hybrid SDKs.

:crystal_ball: Next steps

brustolin avatar Nov 13 '23 09:11 brustolin

Fails
:no_entry_sign:

CHANGELOG.md#L351 - The changelog entry seems to be part of an already released section ## 8.29.1. Consider moving the entry to the ## Unreleased section, please.

Messages
:book: Do not forget to update Sentry-docs with your feature once the pull request gets approved.

Generated by :no_entry_sign: dangerJS against 9085c9784abddbb3025f5e944302b9e7de0aebca

github-actions[bot] avatar Nov 13 '23 09:11 github-actions[bot]

Although the project is now compiling, when we run profiling it crashes, I couldn't figure it out the problem.

@armcknight maybe you would like to take a look at this? Try the iOS13-Swift sample. It needs to run on a real device.

brustolin avatar Nov 13 '23 09:11 brustolin

Performance metrics :rocket:

  Plain With Sentry Diff
Startup time 1227.51 ms 1245.79 ms 18.28 ms
Size 21.58 KiB 418.44 KiB 396.86 KiB

Baseline results on branch: main

Startup times

Revision Plain With Sentry Diff
4977fbc0390d8c3820831c28d902690b059dd69d 1217.26 ms 1239.82 ms 22.56 ms
3a01b17b5d8fcc05e8a5306cbc9706d640daba39 1212.12 ms 1221.80 ms 9.68 ms
3297d6e29cd37151f68e9c605aea8b44e5e432cb 1228.57 ms 1255.04 ms 26.47 ms
5e78d2bda0491ad7949da92051d7ff20013f486f 1229.90 ms 1252.94 ms 23.04 ms
a4231611a441a8aefd6f8e572dd921d38b6b08e5 1234.41 ms 1246.41 ms 12.00 ms
2124551ae0a2da085c69d3f0838e9be646fa4a70 1246.10 ms 1254.84 ms 8.74 ms
3f366ee274433a5eee44c472050832d520036638 1244.49 ms 1257.28 ms 12.79 ms
c5232eadff3e28f984b91f2f5ce629ad7381aad1 1226.06 ms 1246.57 ms 20.50 ms
4afae53402dcdbddb191966eba4f2778310c938c 1217.65 ms 1229.27 ms 11.62 ms
7bb0873cd04a43827b601dce3e8dcdc3477c6c23 1215.65 ms 1235.00 ms 19.35 ms

App size

Revision Plain With Sentry Diff
4977fbc0390d8c3820831c28d902690b059dd69d 20.76 KiB 419.85 KiB 399.10 KiB
3a01b17b5d8fcc05e8a5306cbc9706d640daba39 20.76 KiB 436.33 KiB 415.57 KiB
3297d6e29cd37151f68e9c605aea8b44e5e432cb 21.58 KiB 418.45 KiB 396.86 KiB
5e78d2bda0491ad7949da92051d7ff20013f486f 22.85 KiB 411.17 KiB 388.32 KiB
a4231611a441a8aefd6f8e572dd921d38b6b08e5 22.85 KiB 406.69 KiB 383.85 KiB
2124551ae0a2da085c69d3f0838e9be646fa4a70 22.85 KiB 411.69 KiB 388.84 KiB
3f366ee274433a5eee44c472050832d520036638 20.76 KiB 427.84 KiB 407.08 KiB
c5232eadff3e28f984b91f2f5ce629ad7381aad1 22.85 KiB 413.98 KiB 391.13 KiB
4afae53402dcdbddb191966eba4f2778310c938c 22.84 KiB 402.08 KiB 379.24 KiB
7bb0873cd04a43827b601dce3e8dcdc3477c6c23 22.85 KiB 407.09 KiB 384.24 KiB

Previous results on branch: fix/arm64e-lib__fp

Startup times

Revision Plain With Sentry Diff
3930ded9900795938f6311f3592215d77673e8ad 1211.16 ms 1228.40 ms 17.24 ms

App size

Revision Plain With Sentry Diff
3930ded9900795938f6311f3592215d77673e8ad 22.85 KiB 414.71 KiB 391.86 KiB

github-actions[bot] avatar Nov 13 '23 09:11 github-actions[bot]

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

Project coverage is 91.423%. Comparing base (d8926bf) to head (9085c97). Report is 75 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@              Coverage Diff              @@
##              main     #3398       +/-   ##
=============================================
+ Coverage   91.063%   91.423%   +0.359%     
=============================================
  Files          610       607        -3     
  Lines        47750     48992     +1242     
  Branches         0     17672    +17672     
=============================================
+ Hits         43483     44790     +1307     
+ Misses        4267      4110      -157     
- Partials         0        92       +92     
Files Coverage Δ
Sources/Sentry/SentryProfiler.mm 87.755% <100.000%> (+0.100%) :arrow_up:
Sources/Sentry/include/SentryThreadState.hpp 90.476% <100.000%> (+4.761%) :arrow_up:
...SentryCrash/Recording/Tools/SentryCrashCPU_arm64.c 100.000% <100.000%> (ø)

... and 233 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update d8926bf...9085c97. Read the comment docs.

codecov[bot] avatar Nov 13 '23 09:11 codecov[bot]

Having a look. For reference, here is the crash: image

It's happening when creating a profiler instance when there's already one in memory which gets destructed.

armcknight avatar Nov 14 '23 00:11 armcknight

It's happening when creating a profiler instance when there's already one in memory which gets destructed.

Thanks! Do we have plans to fix this already?

brustolin avatar Nov 15 '23 07:11 brustolin

Sorry for the late reply @brustolin ... we're currently investigating the root cause.

armcknight avatar Nov 17 '23 23:11 armcknight

Any update on this one @armcknight

brustolin avatar Dec 20 '23 14:12 brustolin

@armcknight @brustolin how do I repro this? I'm running on device with the iOS-Swift sample app and starting/ending profiles and I can't get it to crash.

indragiek avatar Dec 27 '23 22:12 indragiek

@armcknight @brustolin how do I repro this? I'm running on device with the iOS-Swift sample app and starting/ending profiles and I can't get it to crash.

Are you running it on a real device? Simulators don't crash. Run the app and start a transaction, it crashes for me.

brustolin avatar Dec 28 '23 09:12 brustolin

@brustolin I can't repro on an iPhone 14 Pro Max running iOS-Swift at the head of this branch. Are you testing with an arm64e or arm64 device? I tried with multiple concurrent transactions as well and everything seems to work fine. @armcknight do you have an older device you can test with?

indragiek avatar Dec 28 '23 20:12 indragiek

@brustolin I can't repro on an iPhone 14 Pro Max running iOS-Swift at the head of this branch. Are you testing with an arm64e or arm64 device? I tried with multiple concurrent transactions as well and everything seems to work fine. @armcknight do you have an older device you can test with?

Ow, sorry. iOS-Swift is not configured to run with arm64e, I must have roll it back. It only crashes when I compile the sample for arm64e. I will commit the change, you should have both options now.

image

brustolin avatar Dec 29 '23 10:12 brustolin

@brustolin Thanks, able to repro now. It appears this is crashing within profiling code any place where shared_ptr is used and free'd (QueueMetadata, ThreadMetadataCache, and SamplingProfiler) so it isn't directly related to any specific piece of code. Replacing shared_ptr with unique_ptr in all of these places fixes the crash but I don't think it's the right way to solve the problem. Still looking into it.

indragiek avatar Dec 30 '23 00:12 indragiek

There is some relevant documentation on Apple's page about arm64e pointer authentication:

Pointer authentication can also expose latent bugs in existing code. In C++, it’s incorrect to call a virtual method using a declaration that differs from its definition. In practice, such calls typically succeed in arm64, but trigger a pointer authentication failure in arm64e. You might encounter this bug when using OS_OBJECT types like dispatch_queue_t and xpc_connection_t. You can’t pass instances of these types from C++ code to an Objective-C++ function (or vice versa) because they’re defined differently in Objective-C++ to support automatic reference counting (ARC).

I suspect something like this is causing the crash.

indragiek avatar Dec 30 '23 00:12 indragiek

I don't think [replacing shared_ptr with unique_ptr in all of these places is] the right way to solve the problem

Why not? We would potentially have to use std::move in places that create new references to these, if any, if my understanding of std::move is correct (based on my reading of https://stackoverflow.com/a/6876833). Are there other considerations?

armcknight avatar Jan 08 '24 20:01 armcknight

@armcknight No doubt we can make the change, I just would not be confident that it actually fixes the issue without understanding why using a shared_ptr here causes a crash, and only on a single architecture. If this is urgent and we need to ship a quick patch then we certainly can do that though.

indragiek avatar Jan 09 '24 02:01 indragiek

@armcknight @indragiek Any progress to solve this crash?

brustolin avatar Jan 31 '24 13:01 brustolin

@brustolin check out the changes Indragie pushed up recently. Do you still see the crash with those?

armcknight avatar Jun 15 '24 00:06 armcknight

Still getting an error at SamplingProfiler::~SamplingProfiler:

image

image

brustolin avatar Jun 18 '24 07:06 brustolin

@brustolin Would you mind pulling the commit I just added and trying again? That was a nasty merge you had to do, and it overwrote some of indragie's changes. Sorry about that. I just tried my test and it no longer crashes. The profiles I created both appeared in the dash, so things appear to still function normally.

armcknight avatar Jun 20 '24 20:06 armcknight

Same error I believe. image

brustolin avatar Jun 21 '24 13:06 brustolin

@brustolin would you mind capturing your repro steps in a UI test so we can eliminate any variation between our machines and see what happens when we run it in CI?

armcknight avatar Jul 24 '24 00:07 armcknight

@armcknight I added a test, but this only crashes when running in the real device (its not even possible to run on a simulator).

brustolin avatar Jul 30 '24 08:07 brustolin

@brustolin Oh, I forgot that you were originally testing with iOS13-Swift. Is there no way to do it in the normal iOS-Swift sample app so we can test it in a simulator? That sample is still set up to build for the arm64e target. Is there something special about iOS13-Swift?

armcknight avatar Aug 02 '24 04:08 armcknight

The problem only occurs when the target is arm64e. This is the whole issue.

I don't want to change iOS-Swift because of this; we would have to test on real devices anyway.

brustolin avatar Aug 02 '24 07:08 brustolin