wasm_runtime_detect_native_stack_overflow works incorrectly under ASAN
The wasm_runtime_detect_native_stack_overflow function compares the address of a local variable with the stack boundary here:
https://github.com/bytecodealliance/wasm-micro-runtime/blob/a6a9f1f45d9f7ebf044ceac71bcf7a9ea2f90f23/core/iwasm/common/wasm_runtime_common.c#L7898
But under ASAN, local variables are placed on a "fake stack" (https://github.com/google/sanitizers/wiki/AddressSanitizerUseAfterReturn#algorithm), so this comparison often produces wrong results and reports "native stack overflow".
I have no minimal reproducible example for WAMR itself, but here is how the underlying pthread API behaves on Linux.
Test file
#define _GNU_SOURCE /* required for pthread_getattr_np */
#include <pthread.h>
#include <stdio.h>

int main(int argc, char* argv[]) {
    int dummy; /* its address stands in for the current frame */
    pthread_t self;
    pthread_attr_t attr;
    size_t stack_size;
    void* addr;

    self = pthread_self();
    if (pthread_getattr_np(self, &attr) != 0) {
        printf("Failed get attr\n");
        return 1;
    }
    /* addr receives the lowest address of the thread's stack */
    pthread_attr_getstack(&attr, &addr, &stack_size);
    printf("Stack %p size %zX; Current frame %p\n", addr, stack_size, &dummy);
    pthread_attr_destroy(&attr);
    return 0;
}
Normal run
clang-18 test.c && ./a.out
Stack 0x7ffcc529e000 size 7FF000; Current frame 0x7ffcc5a9ce4c
Here 0x7ffcc529e000 is less than 0x7ffcc5a9ce4c, so the local variable lies above the stack's lowest address and everything works as expected.
ASAN run
clang-18 -fsanitize=address test.c && ./a.out
Stack 0x7fff01009000 size 7FE000; Current frame 0x7fa947d00020
Here 0x7fff01009000 is greater than 0x7fa947d00020, so the local variable lies below the reported stack range, and a check that assumes locals live on the real stack may wrongly decide that the stack has already overflowed.
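The comparison done by hand in the two runs above can be sketched as a small helper. This is only an illustration of the range test, not WAMR code: the name example_addr_in_thread_stack is hypothetical, and it assumes Linux with glibc, where pthread_getattr_np is available.

```c
#define _GNU_SOURCE /* required for pthread_getattr_np */
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if p lies inside the calling thread's
 * stack range as reported by pthread_attr_getstack, 0 if it lies outside,
 * and -1 if the stack attributes could not be queried. */
static int example_addr_in_thread_stack(const void *p)
{
    pthread_attr_t attr;
    void *base = NULL; /* lowest address of the stack */
    size_t size = 0;
    int rc;

    if (pthread_getattr_np(pthread_self(), &attr) != 0)
        return -1;
    rc = pthread_attr_getstack(&attr, &base, &size);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return -1;

    /* Compare as integers to avoid relational comparison of unrelated
     * pointers. */
    uintptr_t lo = (uintptr_t)base;
    uintptr_t a = (uintptr_t)p;
    return (a >= lo && a - lo < size) ? 1 : 0;
}
```

In a normal build, passing the address of a local variable should give 1. In a build where ASAN places locals on the fake stack, the same call may give 0, which corresponds to the second run above.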
Your environment
- Linux
I suggest disabling all logic in wasm_runtime_detect_native_stack_overflow when the condition #if __has_feature(address_sanitizer) is true.
If this is OK, I can provide a patch.
That's strange. We do have ASAN CI enabled and haven't seen any issue so far. Would you mind trying this sample: native-stack-overflow to see whether it fails?
PS: I think if we do need to modify the code, we should use __SANITIZE_ADDRESS__ to disable the logic; __has_feature(address_sanitizer) is only for clang if I remember correctly.
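A minimal sketch of how the two detection mechanisms mentioned here could be combined. All EXAMPLE_/example_ names are hypothetical and this is not the actual WAMR patch. It relies on GCC defining __SANITIZE_ADDRESS__ when ASAN is enabled and on Clang documenting __has_feature(address_sanitizer). The __has_feature test is nested so that compilers without it never have to parse the call.

```c
#include <stdbool.h>
#include <stdint.h>

/* GCC defines __SANITIZE_ADDRESS__ when built with -fsanitize=address. */
#if defined(__SANITIZE_ADDRESS__)
#define EXAMPLE_ASAN_ENABLED 1
#endif

/* Clang: __has_feature(address_sanitizer). Nested on purpose, see above. */
#if !defined(EXAMPLE_ASAN_ENABLED) && defined(__has_feature)
#if __has_feature(address_sanitizer)
#define EXAMPLE_ASAN_ENABLED 1
#endif
#endif

#ifndef EXAMPLE_ASAN_ENABLED
#define EXAMPLE_ASAN_ENABLED 0
#endif

/* Hypothetical stand-in for the runtime's check: returns true when the
 * current frame is considered to be above the given boundary. */
static bool example_stack_ok(const uint8_t *boundary)
{
#if EXAMPLE_ASAN_ENABLED
    /* Locals may live on ASAN's fake stack, so the address comparison is
     * meaningless here; skip it. */
    (void)boundary;
    return true;
#else
    uint8_t marker = 0;
    return (uintptr_t)&marker >= (uintptr_t)boundary;
#endif
}
```

The trade-off of this approach is that ASAN builds lose the software overflow check entirely, so whether that is acceptable depends on what the ASAN CI is expected to cover.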
Tested native-stack-overflow on my machine on commit a6a9f1f45d9f7ebf044ceac71bcf7a9ea2f90f23.
Without sanitizers it does not work:
./run.sh
====== Interpreter test1
stack size | fail? | leak? | exception
---------------------------------------------------------------------------
unhandled SIGSEGV, si_addr: (nil)
Aborted (core dumped)
With ASAN it produces the following output:
./run.sh
====== Interpreter test1
stack size | fail? | leak? | exception
---------------------------------------------------------------------------
0 - 16 | failed | ok | Exception: native stack overflow
16 - 24576 | failed | ok | Exception: invalid exec env
====== Interpreter WAMR_DISABLE_HW_BOUND_CHECK=1 test1
stack size | fail? | leak? | exception
---------------------------------------------------------------------------
0 - 24576 | failed | ok | Exception: native stack overflow
====== AOT test1
stack size | fail? | leak? | exception
---------------------------------------------------------------------------
0 - 16 | failed | ok | Exception: native stack overflow
16 - 24576 | failed | ok | Exception: invalid exec env
====== AOT w/ signature test1
stack size | fail? | leak? | exception
---------------------------------------------------------------------------
0 - 16 | failed | ok | Exception: native stack overflow
16 - 24576 | failed | ok | Exception: invalid exec env
====== AOT WAMR_DISABLE_HW_BOUND_CHECK=1 test1
stack size | fail? | leak? | exception
---------------------------------------------------------------------------
0 - 24576 | failed | ok | Exception: native stack overflow
====== AOT w/ signature WAMR_DISABLE_HW_BOUND_CHECK=1 test1
stack size | fail? | leak? | exception
---------------------------------------------------------------------------
0 - 24576 | failed | ok | Exception: native stack overflow
I checked what happens under GDB for the first test case, and it seems to fail on the very first stack check. Here is the stack trace:
(gdb) bt
#0 0x0000555555574688 in wasm_set_exception_local (exception=0x5555555fe9c0 "native stack overflow", module_inst=0x516000000080)
at /home/vchigrin/projects/wasm-micro-runtime/core/iwasm/common/wasm_runtime_common.c:3076
#1 wasm_set_exception (module_inst=0x516000000080, exception=exception@entry=0x5555555fe9c0 "native stack overflow")
at /home/vchigrin/projects/wasm-micro-runtime/core/iwasm/common/wasm_runtime_common.c:3099
#2 0x0000555555574a3d in wasm_runtime_set_exception (module_inst_comm=<optimized out>,
exception=exception@entry=0x5555555fe9c0 "native stack overflow")
at /home/vchigrin/projects/wasm-micro-runtime/core/iwasm/common/wasm_runtime_common.c:3192
#3 0x000055555557a0e7 in wasm_runtime_detect_native_stack_overflow (exec_env=exec_env@entry=0x524000002100)
at /home/vchigrin/projects/wasm-micro-runtime/core/iwasm/common/wasm_runtime_common.c:7899
#4 0x000055555557a80f in call_wasm_with_hw_bound_check (module_inst=module_inst@entry=0x516000000080,
exec_env=exec_env@entry=0x524000002100, function=function@entry=0x513000000130, argc=argc@entry=2,
argv=argv@entry=0x7ffff5700030) at /home/vchigrin/projects/wasm-micro-runtime/core/iwasm/interpreter/wasm_runtime.c:3609
#5 0x000055555557bd39 in wasm_call_function (exec_env=exec_env@entry=0x524000002100, function=function@entry=0x513000000130,
argc=argc@entry=2, argv=argv@entry=0x7ffff5700030)
at /home/vchigrin/projects/wasm-micro-runtime/core/iwasm/interpreter/wasm_runtime.c:3689
#6 0x00005555555745f1 in wasm_runtime_call_wasm (exec_env=exec_env@entry=0x524000002100, function=0x513000000130,
argc=argc@entry=2, argv=argv@entry=0x7ffff5700030)
at /home/vchigrin/projects/wasm-micro-runtime/core/iwasm/common/wasm_runtime_common.c:2666
#7 0x00005555555702cf in main (argc=<optimized out>, argv=<optimized out>)
at /home/vchigrin/projects/wasm-micro-runtime/samples/native-stack-overflow/src/main.c:161
Several observations:
- The native-stack-overflow samples work well with and without -DWAMR_BUILD_SANITIZER=asan on my local Ubuntu 22.04 LTS environment.
- AddressSanitizerUseAfterReturn is off by default. The "fake stack" should not be involved unless another configuration enables it.
- Both GCC and Clang have some kind of ignore-list feature, IIUC. For example, see https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html#index-finstrument-functions-exclude-function-list. It might be better to try them first.
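One related option, shown here only as an assumption and not as the mechanism linked above, is the per-function no_sanitize_address attribute that both GCC and Clang accept. A function marked this way is not instrumented by ASAN, so its locals should stay on the real stack. The names below are hypothetical.

```c
#include <stdint.h>

/* Hypothetical macro: exclude a function from ASAN instrumentation and
 * keep it from being inlined into an instrumented caller. */
#if defined(__GNUC__) || defined(__clang__)
#define EXAMPLE_NO_ASAN __attribute__((no_sanitize_address, noinline))
#else
#define EXAMPLE_NO_ASAN
#endif

/* Returns an address inside this function's own frame, as an integer.
 * Because the function is not instrumented, the marker is expected to be
 * on the real thread stack even in an ASAN build. */
EXAMPLE_NO_ASAN
static uintptr_t example_current_frame_addr(void)
{
    volatile char marker = 0;
    return (uintptr_t)&marker;
}
```

A stack check could compare the value returned by such a helper against the boundary instead of taking the address of a local in an instrumented function. Whether this fits the runtime's structure would need to be verified.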
Tested native-stack-overflow on my machine on commit a6a9f1f45d9f7ebf044ceac71bcf7a9ea2f90f23. Without sanitizers it does not work
./run.sh
====== Interpreter test1
stack size | fail? | leak? | exception
---------------------------------------------------------------------------
unhandled SIGSEGV, si_addr: (nil)
Aborted (core dumped)
Can you file a separate bug with a bit more detail about this? I couldn't reproduce it. (macOS, x86-64, clang)