wasmi icon indicating copy to clipboard operation
wasmi copied to clipboard

Optimize instruction dispatch

Open Robbepop opened this issue 3 years ago • 4 comments

The majority of the overhead of interpreters and in particular wasmi interpreter is the overhead of the instruction dispatch. Therefore there are 3 main ways to improve efficiency of efficient interpreters:

  • Improve the performance of the dispatch routines, i.e. reduce their overhead.
  • Reduce the amount of executed instructions, e.g. by combining instructions into super instructions.
  • Help the CPU branch predictor to correctly predict the next branch. This is due to the fact that instruction dispatch usually consists of at least one indirect branch. It is possible to help the CPU utilize better branch prediction by providing it with more information. For example having only a single branch when using a single match statement for the dispatch routine is less efficient than having a branch per instruction (match arm) since the branch predictor can include the position of the branch into account for its prediction. Some benchmark indicate 50%-100% performance gains.

Work Items

  • [ ] Fuse common instruction sequences into super instructions for wasmi bytecode during Wasm module compilation.
    • https://github.com/paritytech/wasmi/issues/325
  • [ ] LLVM is able to optimize switch based dispatch into one where branch predictors will benefit more at the cost of increased binary size. LLVM usually opts out of this to our despair. It might be possible to find ways to make LLVM optimize into that form from within Rust.
  • [ ] LLVM already supports guaranteed tail calls. As soon as Rust provides them too we should definitely experiment with dispatch based on tail calls similar to the Wasm3 interpreter.

Robbepop avatar Jan 08 '22 09:01 Robbepop

This architecture could be used to speed up instruction dispatch in wasmi with safe Rust code: https://github.com/Neopallium/s1vm

Robbepop avatar Feb 12 '22 10:02 Robbepop

This article well describes different instruction dispatch techniques and their expected performance: https://www.complang.tuwien.ac.at/forth/threaded-code.html

Robbepop avatar Feb 12 '22 10:02 Robbepop

Research into different instruction dispatch techniques implementable in Rust: https://github.com/Robbepop/interpreter-dispatch-research

Robbepop avatar Feb 17 '22 16:02 Robbepop

PR merged to refactor the instruction dispatch for great wins: https://github.com/paritytech/wasmi/pull/376

Robbepop avatar Jul 13 '22 16:07 Robbepop

Closed since all TODO items have been answered or resolved.

Robbepop avatar Aug 22 '22 18:08 Robbepop