wasmtime
Change the function alignment on x86-64 to 32 bytes
The current default function alignment of 16 bytes on x86-64 causes suboptimal execution performance in some cases, as reported in https://github.com/bytecodealliance/wasmtime/issues/8573.
Based on the observation that "the CPU frontend grabs an aligned 32B or 64B chunk at a time" from the discussion in https://github.com/bytecodealliance/wasmtime/issues/8573, this PR changes the default alignment from 16 bytes to 32 bytes for better performance.
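To illustrate the reasoning, here is a minimal sketch of how the alignment choice can change the number of aligned 32-byte chunks that a short piece of code touches. It is not Cranelift's actual code: `align_up`, `windows_touched`, the offsets, and the 24-byte length are all hypothetical, and the real slowdown in the issue may involve more than chunk straddling.

```rust
/// Round `offset` up to the next multiple of `align`.
/// Assumes `align` is a power of two (hypothetical helper, not a Cranelift API).
fn align_up(offset: u64, align: u64) -> u64 {
    debug_assert!(align.is_power_of_two());
    (offset + align - 1) & !(align - 1)
}

/// Count the aligned `window`-byte chunks overlapped by the byte range
/// `[start, start + len)`.
fn windows_touched(start: u64, len: u64, window: u64) -> u64 {
    if len == 0 {
        return 0;
    }
    let first = start / window;
    let last = (start + len - 1) / window;
    last - first + 1
}

fn main() {
    // Hypothetical layout: the previous function ends at offset 0x1244 and
    // the next function starts with 24 bytes of hot code.
    let prev_end: u64 = 0x1244;
    let hot_len: u64 = 24;

    for align in [16u64, 32] {
        let start = align_up(prev_end, align);
        let chunks = windows_touched(start, hot_len, 32);
        println!(
            "align {:>2}: function starts at {:#x}, hot code touches {} 32-byte chunk(s)",
            align, start, chunks
        );
    }
}
```

With 16-byte alignment the function in this example starts in the middle of a 32-byte chunk, so the hot code spans two chunks; with 32-byte alignment it fits in one.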
I also reran the cases reported in https://github.com/bytecodealliance/wasmtime/issues/8573. With this change the execution time of bad.wasm is back to normal: about 4.68s instead of about 9.37s, matching good.wasm.
# After changes
➜ case ✗ wasmtime compile good.wasm -o good.cwasm
➜ case ✗ wasmtime compile bad.wasm -o bad.cwasm
➜ case ✗ time wasmtime run --allow-precompiled good.cwasm
~/wasmtime/target/release/wasmtime run good.cwasm 4.68s user 0.00s system 100% cpu 4.680 total
➜ case ✗ time wasmtime run --allow-precompiled bad.cwasm
~/wasmtime/target/release/wasmtime run bad.cwasm 4.67s user 0.01s system 99% cpu 4.681 total
# Before changes
➜ case ✗ wasmtime compile good.wasm -o good.cwasm
➜ case ✗ wasmtime compile bad.wasm -o bad.cwasm
➜ case ✗ time wasmtime run --allow-precompiled good.cwasm
~/wasmtime/target/release/wasmtime run good.cwasm 4.67s user 0.00s system 100% cpu 4.674 total
➜ case ✗ time wasmtime run --allow-precompiled bad.cwasm
~/wasmtime/target/release/wasmtime run bad.cwasm 9.36s user 0.01s system 100% cpu 9.365 total
Thanks for this! It looks like this is causing changes in the disassembly of some existing tests. You can update the disassembly locally with WASMTIME_TEST_BLESS=1 cargo test --test disas and then commit those changes and push them up here.