wasmi icon indicating copy to clipboard operation
wasmi copied to clipboard

Optimization: Inline caching of the top-most stack value

Open Robbepop opened this issue 3 years ago • 3 comments

Currently the wasmi Wasm execution engine follows a stack machine based execution model. This means that most instructions pop of values from the stack, manipulate them and push the result back to the execution stack.

This has some obvious performance impacts and the performance of the ValueStack data structure used in the execution engine is of critical importance.

However, it is possible to reduce strain on the ValueStack by caching the top-most value of the ValueStack inline in the ExecutionContext or even in the interpreter hot-loop that drives the execution.

Experiments in other Wasm engines have found this to improve performance significantly for compute intense workloads. Call intense workloads might experience slight regressions due to more setup and tear down synchronization required in between calls.

Robbepop avatar Aug 27 '22 10:08 Robbepop

We might require some more Instruction variants that store information about the value stack since special care must be taken when the value stack is empty. The affected instructions are the ones that might appear on an empty value stack in a valid Wasm source and are called provider instructions from now on. The following instructions are affected:

  • local.get: Either needs to update the top most value stack item or exchange the already existing top most value stack item and push the old one onto the value stack.
  • global.get: See local.get.
  • memory.size: See local.get.
  • {i32,i64,f32,f64}.const: See local.get.

Interesting cases:

  • br: No special care is needed since empty value stacks always lead to br having no dropped or kept values. The DropKeep value stack adjustment needs to take the top-most value stack item into consideration. The same applies to br_eqz and br_nez instructions.
  • return: No special care is needed since empty value stacks always lead to return returning no values and if return returns actual values we are guaranteed to not have an empty value stack. Before returning to the caller the inline cached top most value stack item needs to be pushed onto the value stack.
  • call: No special care is needed since empty value stacks always lead to call having no parameters for the called function. If there are parameters for the called function then we are guaranteed that the value stack is non-empty. Before calling a function the inline cached top most value stack item needs to be pushed onto the value stack.

Robbepop avatar Sep 05 '22 15:09 Robbepop

I declare this optimization experiment to have failed with the closing of the respective experimental PR: https://github.com/paritytech/wasmi/pull/429

The problem was twofold:

  1. It introduces way more complexity than I think is reasonable.
  2. Preliminary benchmarks actually displayed performance regressions. Although this might have been a premature test run.

Robbepop avatar Sep 23 '22 11:09 Robbepop

I found this paper which describes exactly this optimization for interpreters. The paper demonstrates 2 different strategies, a compile-time one where all the required information is computed at compile-time and a runtime based one that introduces overhead during runtime to compute the information on the fly. The paper itself says that the latter is not really useful since this whole technique is about optimizing this particular hot-path of a stack machine based interpreter. Therefore we should concentrate on what the paper describes as the compile-time based strategy.

I reopen this issue since we actually still want to implement this optimization eventually. Maybe we just need another try.

Robbepop avatar Oct 02 '22 14:10 Robbepop

This was tried by me personally multiple times already and unfortunately I was never able to reach a point where the performance actually improved with this change. Maybe I have not tried hard enough but I think there are lower hanging fruits to improve performance than this can of worms.

Robbepop avatar Nov 17 '23 20:11 Robbepop