Changing an immediate operand from 0 to 1 causes a suspicious performance decrease

Open • hungryzzz opened this issue 8 months ago • 1 comment

Test Cases

cases.zip

Steps to Reproduce

Hi, I ran the two attached cases (good.wasm and bad.wasm) in Wasmtime and WasmEdge (AOT) and collected their execution times, measured with the time tool.

# command to collect execution time of wasmtime
wasmtime compile bad.wasm -o bad.cwasm
time wasmtime run --allow-precompiled bad.cwasm

# command to collect execution time of wasmedge
wasmedgec bad.wasm bad-wasmedge-aot.wasm
time wasmedge bad-wasmedge-aot.wasm
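A single run under time can be noisy, so it may help to repeat each measurement a few times. The following is a minimal sketch of such a harness, not part of the original report. It assumes wasmtime is on the PATH and that bad.cwasm was produced by the compile step above; the helper names time_command and summarize are hypothetical.

```python
import statistics
import subprocess
import time

# Command taken from the steps above; assumes bad.cwasm already exists.
WASMTIME_CMD = ["wasmtime", "run", "--allow-precompiled", "bad.cwasm"]


def time_command(cmd, runs=5):
    """Run cmd `runs` times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Discard program output so terminal I/O does not skew the timing.
        subprocess.run(
            cmd,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return the fastest and the median duration of a list of runs."""
    return {"min": min(durations), "median": statistics.median(durations)}


# Example usage (needs wasmtime installed):
#   print(summarize(time_command(WASMTIME_CMD)))
```

The same helper can be pointed at the wasmedge command by passing a different argument list. Note that it measures whole-process wall-clock time, so it includes runtime startup just as the time tool does.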

Expected Results & Actual Results

For good.wasm, the execution times in the two runtimes are as follows:

  • Wasmtime: 1.42s
  • WasmEdge: 1.05s

For bad.wasm, the execution times in the two runtimes are as follows:

  • Wasmtime: 3.29s
  • WasmEdge: 0.91s

The only difference between the two attached cases is shown below: one operand of i32.and changes from 0 to 1. This change costs about 1.87s (1.42s to 3.29s) on Wasmtime but has no negative effect on WasmEdge.

➜  cases diff good.wat bad.wat
35c35
<             i32.const 0
---
>             i32.const 1

;; part of good.wat
if
  i32.const 1
  local.set 2
  i32.const 1
  local.get 0
  i32.const 0 ;; here
  i32.and
  i32.add
  local.set 6
  br 1

;; part of bad.wat
if
  i32.const 1
  local.set 2
  i32.const 1
  local.get 0
  i32.const 1 ;; here
  i32.and
  i32.add
  local.set 6
  br 1
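One semantic consequence of the change is worth spelling out: x & 0 is always 0, so in good.wat the value stored to local 6 is the constant 1 whatever local 0 holds, while in bad.wat it depends on the low bit of local 0. The sketch below models the two stack sequences with i32 wraparound. The function names are hypothetical, and the sketch says nothing about what either compiler actually does with these expressions.

```python
MASK32 = 0xFFFFFFFF  # i32 arithmetic wraps modulo 2**32


def good_expr(local0):
    """Value stored to local 6 in good.wat: 1 + (local0 & 0)."""
    return (1 + (local0 & 0)) & MASK32


def bad_expr(local0):
    """Value stored to local 6 in bad.wat: 1 + (local0 & 1)."""
    return (1 + (local0 & 1)) & MASK32


# good_expr is constant, so a compiler may fold it to `i32.const 1`;
# bad_expr must keep local 0 live up to this point.
samples = [0, 1, 2, 3, 0x7FFFFFFF, 0xFFFFFFFF]
good_values = {good_expr(x) for x in samples}
bad_values = {bad_expr(x) for x in samples}
```

For these samples, good_values contains only 1, while bad_values contains both 1 and 2.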

I inspected the machine code generated by Wasmtime and WasmEdge. The instruction counts for func2 (the function that contains the difference) are as follows:

  • WasmEdge

    • good.wasm: 89
    • bad.wasm: 92
    • diff: 3
  • Wasmtime

    • good.wasm: 132
    • bad.wasm: 157
    • diff: 25
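To put the reported figures in relative terms, the following sketch recomputes the time ratios and instruction-count growth from the numbers above. The dictionary layout and function names are illustrative only; the numbers themselves are copied from this report.

```python
# Execution times in seconds and func2 instruction counts, as reported above.
reported = {
    "Wasmtime": {"good_s": 1.42, "bad_s": 3.29, "good_insts": 132, "bad_insts": 157},
    "WasmEdge": {"good_s": 1.05, "bad_s": 0.91, "good_insts": 89, "bad_insts": 92},
}


def time_ratio(r):
    """bad.wasm time divided by good.wasm time (above 1 means slower)."""
    return r["bad_s"] / r["good_s"]


def inst_growth(r):
    """Relative growth of the func2 instruction count from good to bad."""
    return (r["bad_insts"] - r["good_insts"]) / r["good_insts"]


for name, r in reported.items():
    print(f"{name}: time x{time_ratio(r):.2f}, instructions +{inst_growth(r):.1%}")
```

On these figures Wasmtime takes more than twice as long on bad.wasm, while its func2 instruction count grows by less than a fifth, so the instruction count alone may not account for the whole slowdown.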

I suspect the change affects data-flow analysis during optimization, so that some optimization decisions differ, but I am confused about why the number of generated machine instructions changes so much. I think the extra instructions in bad.wasm may be what makes it slower than the good one. I also profiled the hotspots in Wasmtime with perf but could not find anything useful.

Versions and Environment

  • Wasmtime version or commit: 7f7064c74
  • Operating system: Linux ringzzz-OptiPlex-Micro-Plus-7010 6.5.0-18-generic
  • Architecture: Intel(R) Core(TM) i5-13500

hungryzzz, May 29 '24 16:05