A problem about "traps in dead code"

Open lum1n0us opened this issue 9 months ago • 6 comments

Recently, a problem has been revealed by a series of fuzz test cases, such as #2555, #2560, #2562, #2563, ~~#2650~~, #2728, #2732 and #2755. All of them share the same pattern in their Wasm modules:

;; a load instruction that will raise an "out of bounds access" exception
;; followed by a drop or some kind of branch instruction

When running such a .wasm with --bound-checks=1, the "memory bounds protection" code catches the out-of-bounds access, throws the exception, and everything is fine.

When running such a .wasm with --bound-checks=0, which is the default mode, WAMR replaces the "memory bounds protection" code with a signal trap (hardware bounds check). Without that protection code, the drop instruction immediately following the load causes the load to be marked as dead code (its result is consumed only by the drop, so it has no live use). Since both the load and the drop are then removed by the optimizer, there is no out-of-bounds access, no SIGSEGV, and no raised exception. Because of that, iwasm --llvm-jit sometimes behaves differently: the other running modes stop when they meet the "out of bounds access" trap.
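As an illustration of the mechanism, here is a native-code analogy (not WAMR code; the address constant and build commands are arbitrary assumptions): an optimizing compiler may delete a non-volatile load whose result is never used, even if that load would fault at run time.

/* Native analogy, not WAMR code. Built with "cc -O0" this typically crashes
 * with SIGSEGV; built with "cc -O2" the dead load is usually removed and the
 * program exits normally -- the same divergence described above. */
#include <stdint.h>

int main(int argc, char **argv)
{
    (void)argv;
    /* An address the compiler cannot prove invalid, but which is almost
     * certainly unmapped at run time. */
    int *p = (int *)((intptr_t)argc * 0x7fff0000);
    int x = *p;  /* the "load" */
    (void)x;     /* the "drop": the result is never used */
    return 0;
}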

What we are struggling with:

  • Is this an issue we have to fix, or is it an acceptable result ❓ The spec says there should be a trap for "out of bounds access". On the other hand, we believe the optimizer is allowed to modify code to achieve the best performance (as long as the final result is not affected). Besides, options like --bound-checks=1 let users reproduce the "out of bounds access" exception.
  • If we want to fix it, there are some options. The key question is whether we have to strictly follow the spec at all times ❓
    • Use the optimization level or another switch to control the dead-code optimization. It would be like --bound-checks=1, but closer to the default mode (HW_BOUNDS_CHECK).
    • Always prevent load instructions from being removed (see the sketch after this list). This will certainly hurt performance.
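A minimal sketch of the second option, assuming the loads are emitted through the LLVM C API (the helper name emit_memory_load_i32 and the keep_oob_trap flag are hypothetical, not real WAMR identifiers): marking a linear-memory load as volatile prevents LLVM from deleting it as dead code.

/* Hypothetical helper, not actual WAMR code: emit an i32 load from linear
 * memory and optionally mark it volatile so dead-code elimination keeps it
 * and the SIGSEGV-based bounds check can still fire at run time. */
#include <stdbool.h>
#include <llvm-c/Core.h>

static LLVMValueRef
emit_memory_load_i32(LLVMBuilderRef builder, LLVMTypeRef i32_type,
                     LLVMValueRef native_addr, bool keep_oob_trap)
{
    LLVMValueRef load =
        LLVMBuildLoad2(builder, i32_type, native_addr, "mem_load");
    if (keep_oob_trap)
        /* Volatile accesses may not be removed or reordered by LLVM
         * (see the LangRef link posted later in the thread). */
        LLVMSetVolatile(load, true);
    return load;
}

The cost of doing this unconditionally is what the benchmark numbers later in the thread measure.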

Please let us know your opinions.

lum1n0us avatar Nov 16 '23 02:11 lum1n0us

it's trivial to make the loads volatile. i wonder how much performance penalty it would involve for real applications, which are typically already optimized when compiled into wasm.

yamt avatar Nov 16 '23 04:11 yamt

In my opinion, any instructions related to memory, global variables, the table and element sections, etc. that involve the state of the runtime execution should not be optimized away, as far as possible. And the other runtimes don't seem to apply the same optimization you mentioned.

abc767234318 avatar Nov 16 '23 07:11 abc767234318

it's trivial to make the loads volatile. i wonder how much performance penalty it would involve for real applications, which are typically already optimized when compiled into wasm.

It really impacts the performance of some workloads; here are the test results of several benchmarks under <wamr_root>/tests/benchmarks:

coremark: -1.46%
gcc-loops: -13%
hashset: -2.6%
tsf: -3.6%

p.s. https://llvm.org/docs/LangRef.html#volatile-memory-accesses

wenyongh avatar Nov 16 '23 07:11 wenyongh

In my opinion, any instructions related to memory, global variables, the table and element sections, etc. that involve the state of the runtime execution should not be optimized away, as far as possible. And the other runtimes don't seem to apply the same optimization you mentioned.

Yes, ideally we should keep them un-optimized. But it really impacts performance and is almost never needed in actual workloads, since dead code should already have been eliminated by the Wasm compiler (e.g. wasi-sdk or emsdk); the issue mainly occurs in the fuzz cases. My suggestion is that opt-level=3 for wamrc targets the best performance, so by default we disable the Volatile attribute there to give developers the opportunity to get the best performance, while for opt-levels smaller than 3 we enable the Volatile attribute. Or we could just add another option to wamrc to enable the Volatile attribute for memory loads. What do you think?

We are also checking whether we can keep the memory-load IR during dead code optimization, e.g. mark the loads as Volatile before the dead-code optimization passes and unmark them after the passes. It is a little complex and takes some effort.
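A rough sketch of that idea, again assuming the LLVM C API (set_loads_volatile and run_optimization_passes are hypothetical names, not actual WAMR functions):

/* Hypothetical sketch, not actual WAMR code: flip the volatile flag on every
 * load in the module so the optimization pipeline cannot delete them, then
 * clear the flag afterwards so later codegen is not pessimized further. */
#include <llvm-c/Core.h>

static void
set_loads_volatile(LLVMModuleRef module, LLVMBool is_volatile)
{
    for (LLVMValueRef fn = LLVMGetFirstFunction(module); fn;
         fn = LLVMGetNextFunction(fn)) {
        for (LLVMBasicBlockRef bb = LLVMGetFirstBasicBlock(fn); bb;
             bb = LLVMGetNextBasicBlock(bb)) {
            for (LLVMValueRef inst = LLVMGetFirstInstruction(bb); inst;
                 inst = LLVMGetNextInstruction(inst)) {
                if (LLVMGetInstructionOpcode(inst) == LLVMLoad)
                    LLVMSetVolatile(inst, is_volatile);
            }
        }
    }
}

/*
 * Intended usage (run_optimization_passes is a placeholder for the real
 * pipeline):
 *     set_loads_volatile(module, 1);
 *     run_optimization_passes(module);
 *     set_loads_volatile(module, 0);
 */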

BTW, from our testing, wasmedge also disables Volatile for non-atomic memory loads in shared-library mode by default; you can try:

wasmedgec test.wasm test.so
wasmedge test.so   (or: wasmedge --reactor test.so <func_to_call>)

wenyongh avatar Nov 16 '23 08:11 wenyongh

I think it's an acceptable result if fixing it really impacts performance, since dead code rarely appears in real-world cases. A warning about the elimination of dead load/store instructions would be a suitable solution for me.

unicornt avatar Nov 16 '23 08:11 unicornt

I think it's an acceptable result if fixing it really impacts performance, since dead code rarely appears in real-world cases. A warning about the elimination of dead load/store instructions would be a suitable solution for me.

Yes, that seems to be a good way; we may try outputting a warning that prompts the user to add an option to wamrc or the iwasm JIT to disable the dead-code optimization for load instructions if needed. One concern is that load instructions may be merged into memcpy or vector operations, where the load is also erased while the memory-access boundary check is kept; we will investigate whether we can distinguish that case from plain dead-code elimination.
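For reference, a hedged sketch of one piece of that check, assuming the LLVM C API (is_memory_intrinsic_call is a hypothetical helper): recognizing calls to the llvm.memcpy/llvm.memmove intrinsics that merged loads may have been lowered into.

/* Hypothetical helper, not actual WAMR code: return non-zero if an
 * instruction is a call to a memcpy/memmove intrinsic, which may have
 * absorbed the original load instructions. */
#include <string.h>
#include <llvm-c/Core.h>

static int
is_memory_intrinsic_call(LLVMValueRef inst)
{
    if (LLVMGetInstructionOpcode(inst) != LLVMCall)
        return 0;
    size_t len = 0;
    const char *name = LLVMGetValueName2(LLVMGetCalledValue(inst), &len);
    return name && (strncmp(name, "llvm.memcpy", 11) == 0
                    || strncmp(name, "llvm.memmove", 12) == 0);
}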

wenyongh avatar Nov 17 '23 02:11 wenyongh