john
john copied to clipboard
SunMD5 format crashes on ARM
This is macOS on ARM:
It's more about documenting, but maybe someone can reproduce the issue and find a misuse of 'some' API by the format.
Note: it is a test-full
run.
Testing: SunMD5 [MD5 128/128 ASIMD 4x2]... (4xOMP) run_tests.sh: line 62: 13013 Abort trap: 6
$JTR_BIN -test-full=0 --format=cpu
Exit status: 134
Everything else looks ok:
All 402 formats passed self-tests!
Ok: -test-full=0 --format=cpu
If fails (to me) when:
- always if group_sz is 64;
- only if group_sz is 64;
- It seems that more than one type of failure happens (?).
Probably something unaligned.
I managed to reproduce now. The trick is to use eg. OMP_NUM_THREADS=3
but it still doesn't happen on Linux, only the M1. My current fixes in #5299 doesn't help.
I don't quite understand this backtrace.
Process 73326 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=1, subcode=0x4a03000)
frame #0: 0x00000001021e3328 libcrypto.3.dylib`_armv8_sve_probe
libcrypto.3.dylib`:
-> 0x1021e3328 <+0>: eor z0.d, z0.d, z0.d
0x1021e332c <+4>: ret
libcrypto.3.dylib`:
0x1021e3330 <+0>: xar z0.d, z0.d, z0.d, #0x20
0x1021e3334 <+4>: ret
Target 0: (john) stopped.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_INSTRUCTION (code=1, subcode=0x4a03000)
* frame #0: 0x00000001021e3328 libcrypto.3.dylib`_armv8_sve_probe
frame #1: 0x00000001021e3830 libcrypto.3.dylib`OPENSSL_cpuid_setup + 600
frame #2: 0x000000019029c1e0 dyld`invocation function for block in dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const::$_0::operator()() const + 168
frame #3: 0x00000001902dde94 dyld`invocation function for block in dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 340
frame #4: 0x00000001902d11a4 dyld`invocation function for block in dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const + 528
frame #5: 0x000000019027c2d8 dyld`dyld3::MachOFile::forEachLoadCommand(Diagnostics&, void (load_command const*, bool&) block_pointer) const + 296
frame #6: 0x00000001902d01cc dyld`dyld3::MachOFile::forEachSection(void (dyld3::MachOFile::SectionInfo const&, bool, bool&) block_pointer) const + 192
frame #7: 0x00000001902dd958 dyld`dyld3::MachOAnalyzer::forEachInitializer(Diagnostics&, dyld3::MachOAnalyzer::VMAddrConverter const&, void (unsigned int) block_pointer, void const*) const + 516
frame #8: 0x0000000190298864 dyld`dyld4::Loader::findAndRunAllInitializers(dyld4::RuntimeState&) const + 448
frame #9: 0x0000000190298c18 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 220
frame #10: 0x0000000190298bf4 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 184
frame #11: 0x0000000190298bf4 dyld`dyld4::Loader::runInitializersBottomUp(dyld4::RuntimeState&, dyld3::Array<dyld4::Loader const*>&) const + 184
frame #12: 0x000000019029c26c dyld`dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const::$_1::operator()() const + 112
frame #13: 0x0000000190298d98 dyld`dyld4::Loader::runInitializersBottomUpPlusUpwardLinks(dyld4::RuntimeState&) const + 304
frame #14: 0x00000001902bc984 dyld`dyld4::APIs::runAllInitializersForMain() + 468
frame #15: 0x00000001902812d0 dyld`dyld4::prepare(dyld4::APIs&, dyld3::MachOAnalyzer const*) + 3480
frame #16: 0x000000019027fe18 dyld`start + 1964
Not a single line of it refers to john, does it?
OK, that lldb crash happens with any format - must be some totally unrelated problem with lldb itself (it and the gcc are the ones included in MacOS though).
Without lldb working, my only clue is this:
Will run 3 OpenMP threads
Testing: SunMD5 [MD5 128/128 ASIMD 4x2]... (3xOMP) Bus error: 10
I think I'll try installing a real gcc and gdb and see what happens. Edit: Apparently I already had gcc-12 and that's what built john. But Homebrew doesn't supply gdb for arm64. I'll try building with native clang "gcc" and see if anything changes.
Unfortunately a build with native clang "gcc" never segfaults, so I can't reproduce. I even tried --format=cpu
in case some other format would be interfering.
Lots of comments in #5299 but here's a better place.
I no longer think this is about alignment at all. Putting that debug print with correct brackets results in a never segfault nor fail (and never shows misalignment).
And to recap, when compiled with GNU gcc it never crashes and we've also ruled out OpenSSL as --without-openssl
doesn't make a difference.
Oh, and also my fixes for the aliasing violations doesn't make a difference - so it's not the compiler acting up on UB.
The mentions of __stack_chk_fail
in backtraces suggest that we have an out of bounds write on the stack.
If we can detect the issue with ASan, then that's what we should be using to debug it - not later crashes in a non-ASan build. Can you show that report from ASan you're referring to?
Note to self: I couldn't use --enable-asan
and --without-openssl
together. Build failed (at least make debug
did) with complaints about AES stuff.
Edit: It's the combo --without-openssl
and make debug
that fails. Opening an issue for that.
I can't reproduce any problem with ASan enabled. I tried the #5299 branch as well as current bleeding-jumbo, make debug
or not, and threads between 1 and 29.
ASAN shows the line where there is a buffer overflow (when, for example) zeros are moved to a variable (a private group_sz).
What did you refer to, @claudioandre-br? Can you post that ASan report? Thanks!
Not many build logs left, I didn't find anything. When I have news I'll post.
If you could get a core file after such crash it would probably help
Ubuntu 22 LTS (easy to reproduce using a cloud ARM instance and a simple --test --format=sunmd5
command line).
gcc version: 11.4.0
Benchmarking: SunMD5 [MD5 128/128 ASIMD 4x2]... (2xOMP) [New Thread 0x4000016fa080 (LWP 9426)]
*** stack smashing detected ***: terminated
*** stack smashing detected ***: terminated
Thread 2 "john" received signal SIGABRT, Aborted.
(gdb) bt
#0 __pthread_kill_implementation (threadid=70368768270464, signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
#1 0x00004000006ff254 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at ./nptl/pthread_kill.c:78
#2 0x00004000006ba67c in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00004000006a7130 in __GI_abort () at ./stdlib/abort.c:79
#4 0x00004000006f3308 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x4000007d3778 "*** %s ***: terminated\n") at ../sysdeps/posix/libc_fatal.c:155
#5 0x0000400000775868 in __GI___fortify_fail (msg=msg@entry=0x4000007d3760 "stack smashing detected") at ./debug/fortify_fail.c:26
#6 0x0000400000775834 in __stack_chk_fail () at ./debug/stack_chk_fail.c:24
#7 0x0000aaaaaad5601c in crypt_all._omp_fn.0 () at sunmd5_fmt_plug.c:527
#8 0x000040000063b80c in ?? () from /lib/aarch64-linux-gnu/libgomp.so.1
#9 0x0000005e0000005e in ?? ()
$ run/john --test --format=sunmd5
Will run 2 OpenMP threads
Benchmarking: SunMD5 [MD5 128/128 ASIMD 4x2]... (2xOMP) *** stack smashing detected ***: terminated
Aborted (core dumped)
$ coredumpctl list
TIME PID UID GID SIG COREFILE EXE SIZE
Fri 2024-01-05 20:44:52 UTC 10102 1001 1002 SIGABRT present /home/circleci/project/run/john 9.3M
core.john.1001.158250485e9b4c2a88e9ee4de80025f4.10102.1704487492000000.zip
I guess for best debugging, this should go along with a corresponding john
binary, but there isn't one in that archive.
Thinking of possible reasons why this issue would only show up on Arm, we could want to try setting SIMD_PARA_MD5
to 2 on an x86-64 SSE*
or 128-bit AVX/XOP build with ASan and see what happens. We rarely use that value on x86_64 - per x86-64.h
we only do on gcc 4.5.x or when building for XOP, which we might not have tried combining with ASan.
If you enable ASAN, no errors are seen:
A regular ./configure --enable-asan && make ...
Should we add something?
- -D_FORTIFY_SOURCE=1
- fstack-protector
- fstack-protector-all
gcc -DAC_BUILT -DJOHN_ASIMD -c -g -Og -fsanitize=address -I/usr/local/include -DARCH_LITTLE_ENDIAN=1 -DPREFER_FLOCK -Wall -fno-omit-frame-pointer -Wno-deprecated-declarations -Wformat-extra-args -Wunused-but-set-variable -Wdate-time -D_POSIX_SOURCE -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -fopenmp -I/usr/local/include -DCL_SILENCE_DEPRECATION -funroll-loops sunmd5_fmt_plug.c -o sunmd5_fmt_plug.o
$ while ../run/john --test --format=sunmd5; do :; done
[...]
Will run 2 OpenMP threads
NOTE: This is a debug build, speed will be lower than normal
Benchmarking: SunMD5 [MD5 128/128 ASIMD 4x2]... (2xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw: 173 c/s real, 87.2 c/s virtual
[...]
The non-simd build works fine.
If you remove the stack protection (to get another hint).
You get a bus error.
Will run 2 OpenMP threads
Benchmarking: SunMD5 [MD5 128/128 ASIMD 4x2]... (2xOMP) Bus error (core dumped)
And gdb is useless.
(gdb) bt
#0 0x0000002300000022 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
Maybe it is possible to reproduce using X86_64.
(gdb) bt
#0 SIMDmd5body (_data=<optimized out>, out=0x555559309c00, reload_state=reload_state@entry=0x0, SSEi_flags=SSEi_flags@entry=0) at simd-intrinsics.c:408
#1 0x000055555571fd73 in crypt_all._omp_fn.0 () at sunmd5_fmt_plug.c:789
#2 0x00007ffff7969a16 in GOMP_parallel (fn=fn@entry=0x55555571e8c0 <crypt_all._omp_fn.0>, data=data@entry=0x7fffffffd9e0, num_threads=8, num_threads@entry=0, flags=flags@entry=0)
at ../../../src/libgomp/parallel.c:178
#3 0x000055555571dfe8 in crypt_all (pcount=<optimized out>, salt=<optimized out>) at sunmd5_fmt_plug.c:527
#4 0x000055555573a0fb in is_key_right (format=format@entry=0x5555559005a0 <fmt_sunmd5>, index=index@entry=19, binary=binary@entry=0x555556b66f24,
ciphertext=ciphertext@entry=0x5555558690f8 "$md5$rounds=904$Vc3VgyFx44iS8.Yu$Scf90iLWN6O6mT9TA06NK/", plaintext=plaintext@entry=0x55555587315f "test",
is_test_fmt_case=is_test_fmt_case@entry=0, dbsalt=0x5555592fd670) at formats.c:592
#5 0x000055555573d2b0 in is_key_right (is_test_fmt_case=0, dbsalt=0x5555592fd670, plaintext=0x55555587315f "test",
ciphertext=0x5555558690f8 "$md5$rounds=904$Vc3VgyFx44iS8.Yu$Scf90iLWN6O6mT9TA06NK/", binary=<optimized out>, index=19, format=0x5555559005a0 <fmt_sunmd5>) at formats.c:588
#6 fmt_self_test_body (full_lvl=-1, db=0x555558fdfcc0, salt_copy=<optimized out>, binary_copy=<optimized out>, format=0x5555559005a0 <fmt_sunmd5>) at formats.c:1295
#7 fmt_self_test (format=format@entry=0x5555559005a0 <fmt_sunmd5>, db=db@entry=0x555558fdfcc0) at formats.c:2045
#8 0x000055555572c87d in benchmark_all () at bench.c:876
#9 0x000055555574585a in john_run () at john.c:1672
#10 main (argc=<optimized out>, argv=<optimized out>) at john.c:2092
diff --git a/src/x86-64.h b/src/x86-64.h
index f77c44ac13..39eec60dca 100644
--- a/src/x86-64.h
+++ b/src/x86-64.h
@@ -214,7 +214,7 @@
#define SIMD_COEF_32 16
#define SIMD_COEF_64 8
#elif __AVX2__
-#define SIMD_COEF_32 8
+#define SIMD_COEF_32 4
#define SIMD_COEF_64 4
#elif __SSE2__
#define SIMD_COEF_32 4
@@ -258,7 +258,7 @@
#elif defined(__GNUC__) && (GCC_VERSION < 40600 || defined(__XOP__)) // 4.6.0
#define SIMD_PARA_MD5 2
#else
-#define SIMD_PARA_MD5 3
+#define SIMD_PARA_MD5 2
#endif
#endif
If that is correct, the format should not use SIMD at all, it is buggy.
Thank you @claudioandre-br for these additional tests.
If you enable ASAN, no errors are seen:
If you remove the stack protection (to get another hint).
I find these comments confusing, as it's not clear what they're comparing against. Surely we did see the crash in CircleCI with ASan enabled, so why wouldn't we in some other(?) test with ASan (and in what way it differs)? Perhaps you do know what compiler flags were used in CircleCI? Perhaps you also know where you ran those other tests (so can mention that)?
#elif __AVX2__ -#define SIMD_COEF_32 8 +#define SIMD_COEF_32 4
That's wrong, things are not supposed to work with incorrect settings like that. To correctly obtain SIMD_COEF_32 = 4
, simply build with --enable-simd=avx
.
Surely we did see the crash in CircleCI with ASan enabled
I'm not sure if I've done this test previously with ASAN enabled. I removed or edited some previous comments as "everything was" 100% remote and only sure a regular build fails on mac M2 [1].
Now, using SSH (so, I can rebuild the way I want, re-test and so son), I was unable to crash when enabling ASAN.
Even forcing CPPFLAGS=sanitize=address (-O2 is used in these cases).
So, I didn't receive anything from ASAN.
[1]
- Also, inside Launchpad when deploying the ARM snap package. Less than 10 times, in the last 5 months maybe.
- And now, using an ARM machine via SSH.
I can't reproduce this problem on an aarch64
box here.
ubuntu@ubuntu:~/john/src$ uname -a
Linux ubuntu 5.10.160-rockchip #34 SMP Thu Mar 21 16:59:26 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux
ubuntu@ubuntu:~/john/src$ ../run/john --test --format=sunmd5
Will run 8 OpenMP threads
Benchmarking: SunMD5 [MD5 128/128 ASIMD 4x2]... (8xOMP) DONE
Speed for cost 1 (iteration count) of 5000
Raw: 1280 c/s real, 207 c/s virtual
ubuntu@ubuntu:~/john/src$ cat /proc/cpuinfo
processor : 0
BogoMIPS : 48.00
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp
CPU implementer : 0x41
CPU architecture: 8
CPU variant : 0x2
CPU part : 0xd05
CPU revision : 0
ubuntu@ubuntu:~/john/src$ cat /etc/lsb-release
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=22.04
DISTRIB_CODENAME=jammy
DISTRIB_DESCRIPTION="Ubuntu 22.04.4 LTS"
ubuntu@ubuntu:~/john/src$ gcc --version
gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Thank you @kholia. It's great that you're still around!
I don't recall all detail of this issue (was the issue disappearing with ASan for us?), but if you have time you could also try an --enable-asan
build.
The last time I saw it failing was on Apr 7 at https://launchpadlibrarian.net/723587866/buildlog_snap_ubuntu_jammy_arm64_john-the-ripper_BUILDING.txt.gz
Random. A second run managed to succeed.
:: Testing: Stribog-512, raw Streebog [GOST R 34.11-2012 64/64]... (4xOMP) PASS
:: Testing: STRIP, Password Manager [PBKDF2-SHA1 128/128 ASIMD 4x]... (4xOMP) PASS
:: Testing: SunMD5 [MD5 128/128 ASIMD 4x2]... (4xOMP) *** stack smashing detected ***: terminated
:: run_tests.sh: line 93: 22374 Aborted (core dumped) "$JTR_BIN" -test-full=0 --format=cpu
:: FAILED: -test-full=0 --format=cpu