[BUG] ndk-stack does not accommodate for the difference in relative PC computation between Android versions
Description
Different Android versions compute the relative PC printed in backtraces in different ways. Here are examples of the very same crash (same APK) on three different devices running three different Android versions:
Android 8.1 on Pixel 2
*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
Build fingerprint: 'google/walleye/walleye:8.1.0/OPM2.171026.006.G3/5513837:user/release-keys'
Revision: 'MP1'
ABI: 'arm'
pid: 10616, tid: 10669, name: 1.ui >>> io.flutter.examples.hello_world <<<
signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
r0 00000000 r1 000029ad r2 00000006 r3 00000008
r4 00002978 r5 000029ad r6 c89ff08c r7 0000010c
r8 00000000 r9 c9cc0ac1 sl dc026400 fp c8a0011c
ip c8ad651b sp c89ff078 lr e5a33c31 pc e5a2d782 cpsr 200e0030
backtrace:
#00 pc 0001a782 /system/lib/libc.so (abort+63)
#01 pc 002c9b91 /data/app/io.flutter.examples.hello_world-tiqKsqQ08yBXU2hWODwfTA==/lib/arm/libflutter.so (offset 0xff5000)
Android 10 on Pixel 4a
*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
Build fingerprint: 'google/flame/flame:10/QQ3A.200805.001/6578210:user/release-keys'
Revision: 'MP1.0'
ABI: 'arm'
Timestamp: 2020-10-20 13:47:08-0700
pid: 20536, tid: 20586, name: 1.ui >>> io.flutter.examples.hello_world <<<
uid: 10222
signal 6 (SIGABRT), code -1 (SI_QUEUE), fault addr --------
r0 00000000 r1 0000506a r2 00000006 r3 c2132878
r4 c213288c r5 c2132870 r6 00005038 r7 0000016b
r8 c2132888 r9 c2132878 r10 c21328a8 r11 c2132898
ip 0000506a sp c2132848 lr ebe2c6e3 pc ebe2c6f6
backtrace:
#00 pc 0005f6f6 /apex/com.android.runtime/lib/bionic/libc.so (abort+166) (BuildId: 8c3173001a99af3ab544de85a610e066)
#01 pc 012beb91 /data/app/io.flutter.examples.hello_world-8nGxY8_VmIDo8hf0WEUzUQ==/lib/arm/libflutter.so (BuildId: f3226de58c8d62b2de4d5f7b4066c4a9c0f07b4e)
#02 pc 65000000 <unknown>
Android 11 on Pixel 3a
*** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
Build fingerprint: 'google/sargo/sargo:11/RP1A.201005.004/6782484:userdebug/dev-keys'
Revision: 'MP1.0'
ABI: 'arm'
Timestamp: 2020-10-22 09:28:26+0200
pid: 13676, tid: 13705, name: 1.ui >>> io.flutter.examples.hello_world <<<
uid: 10274
signal 6 (SIGABRT), code -1 (SI_QUEUE), fault addr --------
r0 00000000 r1 00003589 r2 00000006 r3 c7a87808
r4 c7a8781c r5 c7a87800 r6 0000356c r7 0000016b
r8 c7a87808 r9 c7a87818 r10 c7a87838 r11 c7a87828
ip 00003589 sp c7a877d8 lr f30433e1 pc f30433f4
backtrace:
#00 pc 000383f4 /apex/com.android.runtime/lib/bionic/libc.so (abort+172) (BuildId: 09f5dc86ced902a66ebda24ea42c217d)
#01 pc 012bfb91 /data/app/~~sGzta02j0vlFNEgy7PjzQA==/io.flutter.examples.hello_world-HrS9T-azBIoKi_uwl9sUkQ==/lib/arm/libflutter.so (BuildId: f3226de58c8d62b2de4d5f7b4066c4a9c0f07b4e)
#02 pc 66000000 <unknown>
Out of all three reports only the last one from Android 11 would symbolise correctly using ndk-stack:
$ ~/android-ndk-r21d/ndk-stack -sym . < crashes.txt
********** Crash dump: **********
Build fingerprint: 'google/walleye/walleye:8.1.0/OPM2.171026.006.G3/5513837:user/release-keys'
#00 0x0001a782 /system/lib/libc.so (abort+63)
#01 0x002c9b91 /data/app/io.flutter.examples.hello_world-tiqKsqQ08yBXU2hWODwfTA==/lib/arm/libflutter.so (offset 0xff5000)
??
??:0:0
Crash dump is completed
********** Crash dump: **********
Build fingerprint: 'google/flame/flame:10/QQ3A.200805.001/6578210:user/release-keys'
#00 0x0005f6f6 /apex/com.android.runtime/lib/bionic/libc.so (abort+166) (BuildId: 8c3173001a99af3ab544de85a610e066)
#01 0x012beb91 /data/app/io.flutter.examples.hello_world-8nGxY8_VmIDo8hf0WEUzUQ==/lib/arm/libflutter.so (BuildId: f3226de58c8d62b2de4d5f7b4066c4a9c0f07b4e)
dart::DN_HelperInternal_makeListFixedLength(dart::Isolate*, dart::Thread*, dart::Zone*, dart::NativeArguments*)
/usr/local/google/home/vegorov/src/flutter/engine/src/out/android_debug/../../third_party/dart/runtime/lib/growable_array.cc:84:3
dart::BootstrapNatives::DN_Internal_makeListFixedLength(dart::Thread*, dart::Zone*, dart::NativeArguments*)
/usr/local/google/home/vegorov/src/flutter/engine/src/out/android_debug/../../third_party/dart/runtime/lib/growable_array.cc:83:1
#02 0x65000000 <unknown>
Crash dump is completed
********** Crash dump: **********
Build fingerprint: 'google/sargo/sargo:11/RP1A.201005.004/6782484:userdebug/dev-keys'
#00 0x000383f4 /apex/com.android.runtime/lib/bionic/libc.so (abort+172) (BuildId: 09f5dc86ced902a66ebda24ea42c217d)
#01 0x012bfb91 /data/app/~~sGzta02j0vlFNEgy7PjzQA==/io.flutter.examples.hello_world-HrS9T-azBIoKi_uwl9sUkQ==/lib/arm/libflutter.so (BuildId: f3226de58c8d62b2de4d5f7b4066c4a9c0f07b4e)
dart::DN_HelperObject_dumpStack(dart::Isolate*, dart::Thread*, dart::Zone*, dart::NativeArguments*)
/usr/local/google/home/vegorov/src/flutter/engine/src/out/android_debug/../../third_party/dart/runtime/lib/object.cc:130:5
dart::BootstrapNatives::DN_Object_dumpStack(dart::Thread*, dart::Zone*, dart::NativeArguments*)
/usr/local/google/home/vegorov/src/flutter/engine/src/out/android_debug/../../third_party/dart/runtime/lib/object.cc:100:1
#02 0x66000000 <unknown>
This is not surprising: ndk-stack simply passes the PCs it extracts from crash dumps as is into llvm-symbolizer (or addr2line). From what I can see this was always the behaviour (even when it was implemented as a C program). Both of these tools expect VMAs; however, only Android 11 prints the correct VMA.
Android 10 is off by 0x1000 (this seems to be the load bias, i.e. the difference between the .text section VMA and its file offset): 0x012bfb91 - 0x012beb91 = 0x1000.
Android 8.1 seems to print offset into RX section - which is off from PC VMA by .text section VMA aligned down to the page size:
$ ~/android-ndk-r21d/toolchains/llvm/prebuilt/darwin-x86_64/bin/x86_64-linux-android-readelf -l libflutter.so | grep 'R E'
LOAD 0xff57c0 0x00ff67c0 0x00ff67c0 0x53f870 0x53f870 R E 0x1000
Observe that 0x00ff67c0 & ~0xFFF = 0xff6000 and 0xff6000 + 0x002c9b91 = 0x12bfb91.
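The two corrections above can be checked with a few lines of arithmetic. This is only a sketch: the constants are copied from the readelf line and the three backtraces in this report, the variable names are made up for illustration, and a 4 KiB page size is assumed.

```python
# Values copied from the readelf output and the backtraces in this report.
SEGMENT_FILE_OFFSET = 0xFF57C0  # Offset column of the "R E" LOAD segment
SEGMENT_VADDR = 0x00FF67C0      # VirtAddr column of the same segment
PAGE_MASK = ~0xFFF              # assumption: 4 KiB pages

ANDROID_8_PC = 0x002C9B91   # frame #01 as printed by Android 8.1
ANDROID_10_PC = 0x012BEB91  # frame #01 as printed by Android 10
ANDROID_11_PC = 0x012BFB91  # frame #01 as printed by Android 11 (a usable VMA)

# Android 10 case: the printed PC is short by the load bias.
load_bias = SEGMENT_VADDR - SEGMENT_FILE_OFFSET

# Android 8.1 case: the printed PC is relative to the page-aligned segment VMA.
page_aligned_vaddr = SEGMENT_VADDR & PAGE_MASK

corrected_android_10 = ANDROID_10_PC + load_bias
corrected_android_8 = ANDROID_8_PC + page_aligned_vaddr

print(hex(load_bias), hex(page_aligned_vaddr))
print(hex(corrected_android_10), hex(corrected_android_8))
```

Both corrected values land on the PC that Android 11 prints for the same frame, which is the only one of the three that llvm-symbolizer resolves to the right function.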
For your convenience this archive (shared with Google only) contains both crashing APK and a library with debugging information.
I suspect this might have gone unnoticed over the years because GCC and LLVM lay out binaries in slightly different ways, so things might have worked okay with GCC binaries and broken with LLVM binaries.
Environment Details
- NDK Version: 21.3.6528147
- Build system:
- Host OS: Mac
- ABI:
- NDK API level:
- Device API level:
For the sake of completeness, here is the correction logic I ended up implementing myself to handle the difference (because I discovered that I can't rely on ndk-stack): https://dart-review.googlesource.com/c/dart_ci/+/168960/4/github-label-notifier/symbolizer/lib/symbolizer.dart#292
computePCBias: (frames) async {
  if ((androidMajorVersion ?? 0) >= 11) {
    return 0;
  }
  // Prior to Android 11 backtraces printed by debuggerd contained PCs
  // which can't be directly used for symbolization. Very old versions
  // of Android printed offsets into RX mapping, while newer versions
  // printed ELF file offsets. We try to differentiate between these two
  // situations by checking if any PCs are outside of .text section range.
  // In both cases we can't directly use this PC for symbolization because
  // it does not necessarily match VMAs used in the ELF file (which is
  // what llvm-symbolizer would need for symbolization).
  final textSection = await ndk.getTextSectionInfo(flutterSo);
  final textStart = textSection.fileOffset;
  final textEnd = textSection.fileOffset + textSection.fileSize;
  final likelySectionOffset =
      !frames.every((f) => textStart <= f.pc && f.pc < textEnd);
  return likelySectionOffset
      ? (textSection.virtualAddress & ~0xfff)
      : (textSection.virtualAddress - textSection.fileOffset);
}
- If the Android major version (if present in Build fingerprint) is 11 or above then no correction is necessary.
- Otherwise try to check if all PCs can be interpreted as file offsets into .text section. If yes, apply load bias; if not, treat them as .text relative and apply VMA bias instead.
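The same heuristic can be written as a standalone function and run against the frames from this report. This is a hedged sketch rather than the linked Dart implementation: `TextInfo` and `compute_pc_bias` are hypothetical names, and the section values are taken from the readelf LOAD line quoted earlier as a stand-in for real .text section info.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class TextInfo:
    """Hypothetical container for the executable section's layout."""
    file_offset: int
    virtual_address: int
    file_size: int


def compute_pc_bias(android_major: Optional[int],
                    pcs: Iterable[int],
                    text: TextInfo) -> int:
    """Return the value to add to each printed PC before symbolizing."""
    if (android_major or 0) >= 11:
        return 0  # Android 11+ already prints usable VMAs.
    start = text.file_offset
    end = text.file_offset + text.file_size
    # If every PC fits in the section's file range, treat them as file offsets.
    all_file_offsets = all(start <= pc < end for pc in pcs)
    if all_file_offsets:
        return text.virtual_address - text.file_offset  # load bias
    # Otherwise assume offsets relative to the page-aligned section VMA.
    return text.virtual_address & ~0xFFF


# Layout assumed from the readelf line in this report.
FLUTTER_TEXT = TextInfo(file_offset=0xFF57C0,
                        virtual_address=0xFF67C0,
                        file_size=0x53F870)

bias_android_8 = compute_pc_bias(8, [0x002C9B91], FLUTTER_TEXT)
bias_android_10 = compute_pc_bias(10, [0x012BEB91], FLUTTER_TEXT)
bias_android_11 = compute_pc_bias(11, [0x012BFB91], FLUTTER_TEXT)
print(hex(bias_android_8), hex(bias_android_10), hex(bias_android_11))
```

Only the libflutter.so frames are passed in, since the bias is specific to one library. As in the original logic, an empty frame list falls through to the load-bias branch.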
The unwinder in Android 8 was simply broken so it doesn't unwind correctly when using the latest linker.
The unwinder in Android 10 has a bug that could cause it to have the wrong relative pc, which is probably what you are seeing.
https://android.googlesource.com/platform/system/unwinding/+/master/libunwindstack/AndroidVersions.md
This document describes known bugs and the versions in which they are present. It also includes how to avoid these bugs on older versions of Android if you so choose.
There is the possibility we could modify ndk-stack to recognize this is an older version and try and modify the relative pc based on known issues. I'm not sure if that would work in all cases, but it's probably worth trying.
> android.googlesource.com/platform/system/unwinding/+/master/libunwindstack/AndroidVersions.md
Oh, this is the sort of document I was missing. It would have saved me a lot of time figuring these out myself by observation.
> There is the possibility we could modify ndk-stack to recognize this is an older version and try and modify the relative pc based on known issues. I'm not sure if that would work in all cases, but it's probably worth trying.
FWIW, if you don't make ndk-stack correct for the bugs in older versions, you could consider at least adding a warning to its output when it detects an old Android version that might be affected. I used to be very puzzled by ndk-stack output and only recently got enough time to figure this out.
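A warning of that kind could key off the Build fingerprint line that debuggerd already prints. The sketch below is an assumption about how it might look, not ndk-stack's actual code: the function names and the warning text are hypothetical, and the regex only covers fingerprints shaped like the three in this report (numeric release after the first colon).

```python
import re
from typing import Optional

# Matches e.g. "Build fingerprint: 'google/flame/flame:10/QQ3A.../...'"
# and captures the leading digits of the release field.
_FINGERPRINT_RE = re.compile(
    r"Build fingerprint: '[^/']+/[^/']+/[^:']+:(\d+)")


def android_major_version(line: str) -> Optional[int]:
    """Return the Android major version from a fingerprint line, if any."""
    match = _FINGERPRINT_RE.search(line)
    return int(match.group(1)) if match else None


def pc_warning(line: str) -> Optional[str]:
    """Return a warning for pre-Android 11 dumps, otherwise None."""
    major = android_major_version(line)
    if major is None or major >= 11:
        return None
    return (f"WARNING: Android {major} backtrace; relative PCs may need "
            "correcting before symbolization")


line_8 = ("Build fingerprint: 'google/walleye/walleye:8.1.0/"
          "OPM2.171026.006.G3/5513837:user/release-keys'")
line_11 = ("Build fingerprint: 'google/sargo/sargo:11/"
           "RP1A.201005.004/6782484:userdebug/dev-keys'")
print(pc_warning(line_8))
print(pc_warning(line_11))
```

Fingerprints whose release field is not numeric (or lines without a fingerprint) yield no version and therefore no warning here; a real tool might prefer to warn in that case too.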
This is no longer a high-priority issue for me (as I have built my own tooling to work around the difference), but it might still be confusing to other people.
(I've added a link from the bionic docs to the unwinder docs: https://android-review.googlesource.com/c/platform/bionic/+/1482722)
Can I ask if this is the problem I'm seeing? On my samsung gs5 I can debug fine using 'adb logcat | ndk-stack -sym' but on a motorola g g5 plus, it tells me:
WARNING: Mismatched build id for obj\symbols\arm64-v8a\libAnthracite.so
WARNING: Expected 7dec706e65f69a669adaa7dd61f98879a519eacc
WARNING: Found 87c3a641e9308a0e8df2af86f1b199692fa9baf2
Are there any known workarounds I can use?
To answer the previous comment, the error you are seeing is that there is a mismatch of the shared library libAnthracite.so. You are trying to get symbol information for two different versions of that library. The version that is on the motorola g g5 plus does not match the version that you are telling ndk-stack to use.
What you are describing is not relevant to this particular bug.