wasmi
Optimize instruction dispatch
Most of the overhead of an interpreter, and of the wasmi interpreter in particular, comes from instruction dispatch.
Therefore, there are 3 main ways to make an interpreter more efficient:
- Improve the performance of the dispatch routines, i.e. reduce their overhead.
- Reduce the amount of executed instructions, e.g. by combining instructions into super instructions.
- Help the CPU branch predictor correctly predict the next branch. Instruction dispatch usually involves at least one indirect branch, and the CPU can predict that branch better when it is given more information. For example, a dispatch routine built on a single `match` statement has only one shared indirect branch, which is less efficient than having one branch per instruction (match arm), because the branch predictor can take the position of the branch into account for its prediction. Some benchmarks indicate 50%-100% performance gains.
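To make the first two points concrete, here is a minimal sketch of a `match`-based dispatch loop over a toy stack bytecode. The `Instr` enum, the `AddConst` super instruction and the `run` function are all hypothetical names invented for illustration; they are not wasmi's actual bytecode or executor. The loop counts how many instructions it dispatches so that the effect of a super instruction is visible.

```rust
/// Hypothetical toy bytecode; this is NOT wasmi's real instruction set.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Instr {
    /// Push a constant onto the value stack.
    Const(i64),
    /// Pop two values and push their sum.
    Add,
    /// Super instruction: add a constant to the top of the stack.
    /// Behaves like `Const(n)` followed by `Add`, but needs one dispatch.
    AddConst(i64),
    /// Stop execution and return the top of the stack.
    Return,
}

/// Runs `code` and returns the result together with the number of
/// dispatched instructions. Returns `None` on stack underflow or when
/// execution runs past the end of `code`.
pub fn run(code: &[Instr]) -> Option<(i64, usize)> {
    let mut stack: Vec<i64> = Vec::new();
    let mut pc = 0;
    let mut dispatched = 0;
    loop {
        let instr = *code.get(pc)?;
        pc += 1;
        dispatched += 1;
        // A single `match`: every instruction goes through this one
        // dispatch point.
        match instr {
            Instr::Const(n) => stack.push(n),
            Instr::Add => {
                let rhs = stack.pop()?;
                let lhs = stack.pop()?;
                stack.push(lhs.wrapping_add(rhs));
            }
            Instr::AddConst(n) => {
                let top = stack.last_mut()?;
                *top = top.wrapping_add(n);
            }
            Instr::Return => return Some((stack.pop()?, dispatched)),
        }
    }
}
```

In this sketch, the sequence `Const(1), Const(2), Add, Const(3), Add, Return` takes six dispatches, while the equivalent `Const(1), AddConst(2), AddConst(3), Return` takes four and yields the same result. How the compiler lowers the single `match` (jump table or otherwise) is up to LLVM and is not controlled by this code.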
Work Items
- [ ] Fuse common instruction sequences into super instructions for `wasmi` bytecode during Wasm module compilation.
    - https://github.com/paritytech/wasmi/issues/325
- [ ] LLVM is able to optimize `switch`-based dispatch into one where branch predictors will benefit more, at the cost of increased binary size. LLVM usually opts out of this, to our despair. It might be possible to find ways to make LLVM optimize into that form from within Rust.
- [ ] LLVM already supports guaranteed tail calls. As soon as Rust provides them too, we should definitely experiment with dispatch based on tail calls, similar to the Wasm3 interpreter.
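As a rough illustration of the first work item, a fusion step can be written as a peephole pass over the instruction stream. The sketch below assumes a hypothetical `Op` enum and a hypothetical `AddConst` super instruction; the real design for wasmi is tracked in the linked issue and may look quite different.

```rust
/// Hypothetical toy bytecode used only for this sketch.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Op {
    LocalGet(u32),
    Const(i32),
    Add,
    /// Hypothetical super instruction replacing `Const(n); Add`.
    AddConst(i32),
}

/// Peephole pass: rewrites every adjacent `Const(n), Add` pair into a
/// single `AddConst(n)` and copies all other instructions unchanged.
pub fn fuse(ops: &[Op]) -> Vec<Op> {
    let mut out = Vec::with_capacity(ops.len());
    let mut i = 0;
    while i < ops.len() {
        match (ops[i], ops.get(i + 1)) {
            (Op::Const(n), Some(Op::Add)) => {
                out.push(Op::AddConst(n));
                i += 2; // consumed two instructions
            }
            (op, _) => {
                out.push(op);
                i += 1;
            }
        }
    }
    out
}
```

This toy bytecode has no control flow. A pass over real bytecode would also have to avoid fusing across branch targets and would need to fix up branch offsets after instructions are removed.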
This architecture could be used to speed up instruction dispatch in wasmi with safe Rust code:
https://github.com/Neopallium/s1vm
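For readers unfamiliar with the idea, the following is a loose sketch of closure-based compilation in safe Rust: the code is translated once into nested closures, so executing it is a chain of calls rather than a decode-and-`match` loop. The `Expr` type and `compile` function are hypothetical and written for this comment only; they are not taken from s1vm and ignore most of what a real Wasm interpreter needs (control flow, traps, memory).

```rust
/// Hypothetical toy expression tree standing in for decoded Wasm code.
pub enum Expr {
    Const(i64),
    Local(usize),
    Add(Box<Expr>, Box<Expr>),
    Mul(Box<Expr>, Box<Expr>),
}

/// Compiled code: a closure that reads the locals and yields a value.
pub type Compiled = Box<dyn Fn(&[i64]) -> i64>;

/// Translates `expr` into nested closures. All decoding happens here,
/// once; running the result only performs closure calls.
pub fn compile(expr: &Expr) -> Compiled {
    match expr {
        Expr::Const(n) => {
            let n = *n;
            Box::new(move |_locals: &[i64]| n)
        }
        Expr::Local(idx) => {
            let idx = *idx;
            // Panics if `idx` is out of bounds; a real implementation
            // would validate indices at compile time instead.
            Box::new(move |locals: &[i64]| locals[idx])
        }
        Expr::Add(lhs, rhs) => {
            let (lhs, rhs) = (compile(lhs), compile(rhs));
            Box::new(move |locals: &[i64]| lhs(locals).wrapping_add(rhs(locals)))
        }
        Expr::Mul(lhs, rhs) => {
            let (lhs, rhs) = (compile(lhs), compile(rhs));
            Box::new(move |locals: &[i64]| lhs(locals).wrapping_mul(rhs(locals)))
        }
    }
}
```

Compiling `Add(Local(0), Mul(Const(2), Local(1)))` once and calling the resulting closure with the locals `[3, 4]` evaluates `3 + 2 * 4`. Each closure call is still an indirect call, so whether this beats a `match` loop has to be measured.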
This article gives a good description of different instruction dispatch techniques and their expected performance: https://www.complang.tuwien.ac.at/forth/threaded-code.html
Research into different instruction dispatch techniques implementable in Rust: https://github.com/Robbepop/interpreter-dispatch-research
PR merged to refactor the instruction dispatch for great wins: https://github.com/paritytech/wasmi/pull/376
Closed since all TODO items have been answered or resolved.