sentry-java

invalid pthread_t 0x7893464cb0 passed to pthread_getcpuclockid

Open crazyzjw opened this issue 1 year ago • 30 comments

❗ Note from the maintainers:

  • we have a reproducer for this which triggers the culprit code path by creating pthreads, attaching them to runtime, and returning (on some paths) without detaching
  • Sentry is not used in this reproducer and as far as our analyses go, Sentry does not cause the above behaviour
  • rather, other dependencies in an app with incorrect detach paths for pthreads are the root cause for the crash in Android Tracer, and Sentry's usage of Android Tracer for Profiling can increase the impact

Integration

sentry-android

Build System

Gradle

AGP Version

7.2.1

Proguard

Disabled

Version

6.15.0

Steps to Reproduce

The native layer occasionally crashes when switching pages. It works fine when the SDK import is removed, and it also runs without crashing when options.enableTracing=false is set.

Expected Result

The native layer occasionally crashes when switching pages, but it works fine when the SDK import is removed.

Actual Result

The native layer occasionally crashes when switching pages, but it works fine when the SDK import is removed.

image

┆Issue is synchronized with this Jira Improvement by Unito

crazyzjw avatar Mar 15 '23 12:03 crazyzjw

@crazyzjw thanks for reporting! Just to clarify: You see no crashes when enableTracing=false right?

markushi avatar Mar 22 '23 13:03 markushi

@crazyzjw would it be possible to provide a reproducible sample in addition? Do you also use our NDK integration? Does it happen on a specific device?

romtsn avatar Mar 22 '23 14:03 romtsn

@crazyzjw thanks for reporting! Just to clarify: You see no crashes when enableTracing=false right?

Yes. Because of this problem I have temporarily set enableTracing=false, and there are no crashes.

crazyzjw avatar Mar 23 '23 02:03 crazyzjw

@crazyzjw would it be possible to provide a reproducible sample in addition? Do you also use our NDK integration? Does it happen on a specific device?

It happens on most devices; I tried about 3 to 4 devices. NDK version: 0.5.4. I think it can happen when the screen is frequently rotated between landscape and portrait.

crazyzjw avatar Mar 23 '23 02:03 crazyzjw

@crazyzjw Would it be possible to check whether disabling the activityFramesTracking option makes the crash go away?

stefanosiano avatar Apr 05 '23 13:04 stefanosiano

@crazyzjw Would it be possible to check whether disabling the activityFramesTracking option makes the crash go away?

Setting isEnableAutoActivityLifecycleTracing=false makes the crash go away.

crazyzjw avatar Apr 06 '23 09:04 crazyzjw

I encountered the same problem. After I set profilesSampleRate = 1.0, the app crashed occasionally during use. After I removed profilesSampleRate (tracesSampleRate remains 1.0), no crashes occurred.

        SentryAndroid.init(application) { options ->
            options.isDebug = BuildConfig.DEBUG
            options.dsn = CoreConfig.SENTRY_DSN
            options.isEnableSystemEventBreadcrumbs = false
            options.isEnableUncaughtExceptionHandler = true
            options.isEnableNdk = true
            options.isAnrEnabled = true
            options.maxRequestBodySize = SentryOptions.RequestSize.MEDIUM
            options.tracesSampleRate = 1.0
            options.profilesSampleRate = 1.0
        }

Sentry Version: 6.14.0 - 6.17.0. Crashed on only one device: M2012K11AC. For now, other devices work fine. Stack trace:

OS Version: Android 13 (TKQ1.220829.002 test-keys)
Report Version: 104

Exception Type: Unknown (SIGABRT)

Application Specific Information: Abort

Thread 0 Crashed:
0 libc.so 0x7c41519674 abort
1 libc.so 0x7c41581aa0 + 533672237728
2 libc.so 0x7c415819b8 + 533672237496
3 libc.so 0x7c41581778 pthread_getcpuclockid
4 libart.so 0x7b95e68140 art::Thread::GetCpuMicroTime
5 libart.so 0x7b95e89590 art::Trace::CompareAndUpdateStackTrace
6 libart.so 0x7b95e8a0a0 + 530796028064
7 libart.so 0x7b95e84b38 art::ThreadList::ForEach
8 libart.so 0x7b95e89df8 art::Trace::RunSamplingThread
9 libc.so 0x7c415814c8 + 533672236232
10 libc.so 0x7c4151aebc + 533671816892

EOF

kasogg avatar Apr 27 '23 03:04 kasogg

Thanks for the feedback @kasogg! We are investigating the issue, and will update you as soon as we have news on this

stefanosiano avatar Apr 27 '23 09:04 stefanosiano

@kasogg @crazyzjw We tried to reproduce the issue but so far couldn't. Ideally we could get some more information about your setup (build/target SDK versions, other dependencies). Or even better have a reproducible example.

markushi avatar May 03 '23 13:05 markushi

@crazyzjw any chance you had a look at this?

markushi avatar May 10 '23 15:05 markushi

@markushi compileSdkVersion : 32, buildToolsVersion : "32.0.0", minSdkVersion : 23, targetSdkVersion : 32,

        appcompat_v7               : 'androidx.appcompat:appcompat:1.4.1',
        activity                   : 'androidx.activity:activity:1.3.1',
        fragment                   : 'androidx.fragment:fragment:1.3.6',
        design                     : 'com.google.android.material:material:1.0.0',
        kotlin_stdlib              : "org.jetbrains.kotlin:kotlin-stdlib-jdk8:1.7.20",
        constraint_layout          : 'androidx.constraintlayout:constraintlayout:1.1.3',
        recycle_view               : 'androidx.recyclerview:recyclerview:1.2.1',
        junit                      : "junit:junit:4.12",
        testRunner                 : 'androidx.test.ext:junit:1.1.1',
        espresso                   : 'androidx.test.espresso:espresso-core:3.1.0',
        ktx                        : 'androidx.core:core-ktx:1.3.2',
        room                       : "androidx.room:room-runtime:$room_version",
        room_compiler              : "androidx.room:room-compiler:2.4.2",
        room_rxjava2               : "androidx.room:room-rxjava2:2.4.2",
        view_model                 : "androidx.lifecycle:lifecycle-viewmodel-ktx:2.3.1",
        lifecycle                  : "androidx.lifecycle:lifecycle-runtime-ktx:2.3.1",
        viewpager2                 : "androidx.viewpager2:viewpager2:1.1.0-alpha01",
        hilt_android               : "com.google.dagger:hilt-android:2.42",
        hilt_compiler              : "com.google.dagger:hilt-android-compiler:2.42",
        transition                 : "androidx.transition:transition:1.4.1",

        compose_activity           : "androidx.activity:activity-compose:1.3.0",
        compose_ui                 : "androidx.compose.ui:ui:1.1.1",
        compose_ui_tooling         : "androidx.compose.ui:ui-tooling:1.1.1",
        compose_runtime_livedata   : "androidx.compose.runtime:runtime-livedata:1.1.1",
        compose_material           : "androidx.compose.material:material:1.1.1",
        compose_animation          : "androidx.compose.animation:animation:1.1.1",
        compose_lifecycle_viewmodel: "androidx.lifecycle:lifecycle-viewmodel-compose:2.5.1",
        compose_appcompat_theme    : "com.google.accompanist:accompanist-appcompat-theme:0.23.1",

kasogg avatar May 11 '23 04:05 kasogg

The problem still exists with compileSdkVersion 33, buildToolsVersion "32.0.0", minSdkVersion 21, targetSdkVersion 30.

crazyzjw avatar May 12 '23 06:05 crazyzjw

@crazyzjw @kasogg we would appreciate it if you could provide a reproducible sample project, because we are unable to reproduce it with the provided details. Thank you in advance!

romtsn avatar May 17 '23 13:05 romtsn

I created a simple project, but the crash did not happen. I'm not sure which code causes the crash.

kasogg avatar May 22 '23 10:05 kasogg

@kasogg in case you find a way to reproduce this - please re-open this issue!

markushi avatar May 24 '23 14:05 markushi

Reported by another customer. They are initializing through React Native init. It seems to occur when profiling is enabled. Similar stack trace with RunSamplingThread and __pthread_internal_find (see linked Jira for example event).

Their SDK versions:

sentry.javascript.react-native 5.20.0
sentry.native.android.sentry-android 7.5.0
sentry.native.android.react-native 0.7.0

realkosty avatar Apr 25 '24 19:04 realkosty

This seems to be unrelated to React Native - most likely related to Android Profiler used in all of the above scenarios. A (very) quick code search yielded:

  • Android Tracer impl of CompareAndUpdateStackTrace performs an equals check on pthread_self() and the sampling thread
  • pthread_self() most likely ends up in an internal function to resolve the thread, and if that fails, 3 things can happen
    1. on Android versions < 26, a null thread is returned without an error (unsure how this is later processed, but this is not what's happening here)
    2. on Android >= 26, if the provided thread_id == nullptr, logs a warning and also returns null
    3. on Android >= 26, if the provided thread_id != nullptr, logs the message we see in the report and aborts <-- this is what happens here
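
The three outcomes above can be modeled as a small decision function. This is only a sketch of the behaviour described in this comment, not bionic's actual code; `resolveThread`, `knownThreads`, and the outcome labels are hypothetical names introduced for illustration.

```javascript
// Simplified model of the lookup outcomes described above.
// Assumption: names and return shapes are illustrative, not taken from bionic.
function resolveThread(apiLevel, threadId, knownThreads) {
  // Lookup succeeded: the thread is still registered.
  if (threadId !== null && knownThreads.has(threadId)) {
    return { outcome: "found", thread: threadId };
  }
  // Case 1: before API level 26 a failed lookup is silently ignored.
  if (apiLevel < 26) {
    return { outcome: "null-silent", thread: null };
  }
  // Case 2: API level 26+ with a null thread id only logs a warning.
  if (threadId === null) {
    return { outcome: "null-with-warning", thread: null };
  }
  // Case 3: API level 26+ with a non-null but unknown id aborts the process.
  return {
    outcome: "abort",
    thread: null,
    message: `invalid pthread_t 0x${threadId.toString(16)} passed to pthread_getcpuclockid`,
  };
}

// Example: a stale, non-null thread id on API level 33 hits case 3.
const liveThreads = new Set([0x1000]);
console.log(resolveThread(33, 0x7893464cb0, liveThreads).message);
```

In this model only case 3 produces the message seen in the report, which matches the observation that the crash needs both a recent API level and a non-null thread id that can no longer be resolved.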

@stefanosiano @markushi does this make sense?

kahest avatar Apr 30 '24 08:04 kahest

@kahest Should we advise customer to disable profiling for Android >= 26 while this is being investigated? or is it < 26 (seems like a typo)?

Sentry.init({
// ...
  _experiments: {
    // profilesSampleRate is relative to tracesSampleRate.
    profilesSampleRate: Platform.OS === 'android' && Platform.Version >= 26 ? 0.0 : 1.0
  },
});

realkosty avatar May 06 '24 16:05 realkosty

@realkosty sadly not a typo - up until Android 25 this error was ignored. Starting with Android 26 (API level 26 that is, which means Android OS 8.0) this causes a fatal error if the requested thread could not be found.

Deactivating Profiling for >= 26 will likely prevent this exact error. However it's possible that in these exact instances the unfound thread causes other (possibly fatal) errors to occur as the program continues on, just with less context about the root cause. Until we find a way to reproduce this, we can't tell for sure.
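
As a sketch, the gating from the snippet quoted above can be pulled into a small pure function so it can be unit-tested without a device. `profilesSampleRateFor` and its parameters are hypothetical names; the one assumption carried over is that React Native's `Platform.Version` is the numeric API level on Android.

```javascript
// Hypothetical helper: returns the profiling sample rate to use for a platform.
// Assumption: on Android, `version` is the numeric API level (as Platform.Version is).
function profilesSampleRateFor(os, version, defaultRate = 1.0) {
  const affected = os === "android" && Number(version) >= 26;
  // Disable profiling only where the abort path exists (API level 26+).
  return affected ? 0.0 : defaultRate;
}

console.log(profilesSampleRateFor("android", 33));
console.log(profilesSampleRateFor("android", 25));
console.log(profilesSampleRateFor("ios", "17.4"));
```

In an app this would be called as `profilesSampleRateFor(Platform.OS, Platform.Version)` inside `_experiments`. It only avoids the abort in the profiler's sampling thread and may not address whatever left the thread unresolvable in the first place.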

After some investigation we did not find a way to reproduce or prevent this. It very much seems like this is a bug in the Android Tracer implementation, but we'll dig in a bit more. We'd appreciate any pointers on how to reproduce this.

kahest avatar May 06 '24 17:05 kahest

@kahest Should we advise customer to disable profiling for Android >= 26 while this is being investigated? or is it < 26 (seems like a typo)?

Sentry.init({
// ...
  _experiments: {
    // profilesSampleRate is relative to tracesSampleRate.
    profilesSampleRate: Platform.OS === 'android' && Platform.Version >= 26 ? 0.0 : 1.0
  },
});

I can confirm this resolved the issue for us on @sentry/[email protected]

marjorg avatar Jun 03 '24 14:06 marjorg

@marjorg thanks for reporting back! FYI @krystofwoldrich @stefanosiano

kahest avatar Jun 03 '24 14:06 kahest

@realkosty since it's hard to get a reproducer, would it be possible to get a tombstone instead? Tombstones are generated after the crash. Running adb bugreport generates a zip file containing the offending tombstone file.

markushi avatar Jun 19 '24 13:06 markushi

Just updating here that we received a crash dump including tombstone and are analysing the information for clues. We'll keep this updated.

kahest avatar Jun 27 '24 08:06 kahest

A similar issue was reported on SO: https://stackoverflow.com/questions/78684061/using-sentry-on-android-make-app-crash-when-ndk-error-happens

markushi avatar Jul 01 '24 06:07 markushi

A similar issue was reported on SO: https://stackoverflow.com/questions/78684061/using-sentry-on-android-make-app-crash-when-ndk-error-happens

I get this error:

invalid pthread_t 0x71e13cbcb0 passed to pthread_getcpuclockid
Fatal signal 6 (SIGABRT), code -1 (SI_QUEUE) in tid 2237 (Sampling Profil), pid 2180 (package name)

Using enableTracing=false makes the error go away, but setting activityFramesTracking doesn't help. I run the app on Android 13, and the crash happens when an NDK error occurs in another library (UVCCamera).

topxebec avatar Jul 13 '24 18:07 topxebec

A trialing customer reported this happening on Android as well. More details are included internally in the associated customer case and Jira ticket.

cstavitsky avatar Jul 25 '24 18:07 cstavitsky

More context: One customer (the above) shared that Google Play Store reports these crashes only for Android 13 and 14 (API 33 and 34) and not below.

kahest avatar Aug 06 '24 16:08 kahest

Quick update: we have a reproducer for this which triggers the culprit code path by creating pthreads, attaching them to runtime, and returning (on some paths) without detaching. Sentry is not used in this reproducer and as far as our analyses go, Sentry does not cause the above behaviour.

Rather, other dependencies in an app with incorrect detach paths for pthreads are the root cause for the crash in Android Tracer, and Sentry's usage of Android Tracer for Profiling can increase the impact.

We're continuing to investigate means to mitigate this problem.

kahest avatar Sep 17 '24 11:09 kahest

There is a somewhat related issue on Google's issue tracker with a discussion on managing and checking one's own pthreads: https://issuetracker.google.com/issues/114509602

kahest avatar Sep 27 '24 13:09 kahest