OpenMP fails on A64FX (AArch64 SVE)
Raised by a user; see https://groups.google.com/g/dynamorio-users/c/S9dHoHjYHns
Describe the bug
Any OpenMP program with more than one thread fails on an A64FX SVE machine.
To Reproduce
$ OMP_NUM_THREADS=2 $DYNAMORIO_DIR/bin64/drrun -debug -- ./bin/is.A.x
<Starting application NPB3.4-OMP/bin/is.A.x (4028)>
<Initial options = -no_dynamic_options -code_api -stack_size 64K -signal_stack_size 64K -max_elide_jmp 0 -max_elide_call 0 -vmm_block_size 64K -initial_heap_unit_size 64K -initial_heap_nonpers_size 64K -initial_global_heap_unit_size 512K -max_heap_unit_size 4M -heap_commit_increment 64K -cache_commit_increment 64K -cache_bb_unit_init 64K -cache_bb_unit_max 64K -cache_bb_unit_quadruple 64K -cache_trace_unit_init 64K -cache_trace_unit_max 64K -cache_trace_unit_quadruple 64K -cache_shared_bb_unit_init 512K -cache_shared_bb_unit_max 512K -cache_shared_bb_unit_quadruple 512K -cache_shared_trace_unit_init 512K -cache_shared_trace_unit_max 512K -cache_shared_trace_unit_quadruple 512K -cache_bb_unit_upgrade 64K -cache_trace_unit_upgrade 64K -cache_shared_bb_unit_upgrade 512K -cache_shared_trace_unit_upgrade 512K -early_inject -emulate_brk -no_inline_ignored_syscalls -no_per_thread_guard_pages -native_exec_default_list '' -no_native_exec_managed_code -no_indcall2direct >
NAS Parallel Benchmarks (NPB3.4-OMP) - IS Benchmark
Size: 8388608 (class A)
Iterations: 10
Number of available threads: 2
<Application NPB3.4.2/NPB3.4-OMP/bin/is.A.x (4028). Cannot correctly handle received signal 11 in thread 4029: default action in native thread.>
The crash seems to happen at thread creation, either when entering an OpenMP parallel region or when calling pthread_create().
Stack trace:
(gdb) bt
#0 0x00004000004574c8 in get_clone_record (xsp=70368753806112) at /home/runner/work/dynamorio/dynamorio/core/unix/signal.c:944
#1 0x000040000043aed8 in new_thread_setup (mc=0x40000092eb20) at /home/runner/work/dynamorio/dynamorio/core/arch/x86_code.c:284
#2 0x0000400200696638 in ?? ()
#3 0x0000ffffffffceb0 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
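Since the crash is reported for plain pthread_create() as well, a test case smaller than NPB may help isolate it. The following is only a sketch: it is not taken from the report, the names `worker` and `run_one_thread` are hypothetical, and it has not been verified to trigger the crash.

```c
#include <pthread.h>
#include <stddef.h>

/* Thread body: store a marker value so the caller can confirm
 * that the new thread actually ran. */
static void *worker(void *arg)
{
    int *value = arg;
    *value = 42;
    return NULL;
}

/* Create one extra thread and wait for it.
 * Returns 0 on success, nonzero if creation or join failed. */
static int run_one_thread(int *result)
{
    pthread_t tid;
    if (pthread_create(&tid, NULL, worker, result) != 0)
        return 1;
    return pthread_join(tid, NULL) != 0;
}
```

Calling `run_one_thread` from a `main`, building with `-pthread`, and running the binary under the same `drrun -debug --` command as above would exercise the thread-creation path without OpenMP.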
The user reported a workaround: dropping the extra PAGE_SIZE that is added to dstack_base on AArch64 in core/unix/signal.c, by changing:
#ifdef AARCH64
dstack_base = (byte *)ALIGN_FORWARD(xsp, PAGE_SIZE) + PAGE_SIZE;
#else
dstack_base = (byte *)ALIGN_FORWARD(xsp, PAGE_SIZE);
#endif
To:
dstack_base = (byte *)ALIGN_FORWARD(xsp, PAGE_SIZE);
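To make the difference between the two variants concrete, the sketch below evaluates both expressions for the `xsp` value from the backtrace (70368753806112, i.e. 0x40000092eb20). It rests on assumptions: `align_forward` is a local stand-in for DynamoRIO's `ALIGN_FORWARD` macro, the helper names are hypothetical, and the 64 KiB page size used in the example is assumed rather than stated in the report.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for ALIGN_FORWARD (assumption): round x up to the
 * next multiple of alignment, which must be a power of two. */
static uint64_t align_forward(uint64_t x, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (x + alignment - 1) & ~(alignment - 1);
}

/* Current AArch64 branch: page-aligned xsp plus one extra page. */
static uint64_t dstack_base_with_extra_page(uint64_t xsp, uint64_t page_size)
{
    return align_forward(xsp, page_size) + page_size;
}

/* Workaround (and non-AArch64 branch): page-aligned xsp only. */
static uint64_t dstack_base_workaround(uint64_t xsp, uint64_t page_size)
{
    return align_forward(xsp, page_size);
}
```

With a 64 KiB page size, `dstack_base_with_extra_page(0x40000092eb20, 0x10000)` gives 0x400000940000, while `dstack_base_workaround` gives 0x400000930000, so the two variants differ by exactly one page.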
Additional context
This error was probably introduced by the initial work on SVE support; see https://github.com/DynamoRIO/dynamorio/pull/5835
It should be fixed when https://github.com/DynamoRIO/dynamorio/issues/6317 is implemented.
Related to https://github.com/DynamoRIO/dynamorio/issues/5365.