design icon indicating copy to clipboard operation
design copied to clipboard

Branch hinting

Open alexp-sssup opened this issue 5 years ago • 9 comments

As you may be aware, Leaning Technologies, is working on a Wasm powered x86 virtualization product named CheerpX, which is in an advanced stage of development. I have added further reading below if anybody is interested in more context.

We generate optimized Wasm bytecode Just-In-Time, and it is fairly common to have code like the following:

...
opcode
opcode
i32.eq
if
  call $slowPath
end
...

Due to the condition involved, the if body is executed very rarely (e.g. <1% probability). It would be advantageous to provide information to the Wasm VM to let it make better decisions in these cases, in particular with regard to:

  1. Code layout: the if body could be put out-of-line at the very end of the function to keep "hot" code compact in the instruction cache
  2. Native hinted branch instructions: if appropriate for the target platform

It also seems to us that branch hinting could be useful not only in our scenario, but also for other Wasm based VMs for dynamic languages or even PGO compiled C++ sources.

Before trying to develop a full proposal we wanted to gather feedback on the few approaches we have though about

  1. Introduction of a new opcode (if_hinted), with an immediate parameter (ether 0/1) to indicate the direction of the hint. Possibly 2/-1 could be used to signal "no hint" or other semantics.
  2. Using the type immediate of the existing if opcode to also encode the hint. For example we could decide that hinting can only be used with void/0x40 block type, and encode the hints as 0x41/0x42. Apologies if those specific values are already reserved for other purposes, this is just intended as an example.
  3. We could also introduce a different opcode, for example assume or expect which works as a pass-trough, consuming one value and pushing the same one afterwards. This could be used to effectively "annotate" or add metadata to a specific value on the stack. This metadata can then be used by a branch instruction to deduce the hint direction.
  4. If the consumption of bytes in the main opcode space is a concern, it seems also possible to define the new opcode into a prefixed space.

From the implementation point of view this concept fit very well in V8, and we have hacked a prototype already. In particular V8 already internally support branch hints and "deferred" blocks which can be naturally used to implement this feature. We are willing to share a refined prototype after the first round of feedback, if there is interest from the community. We have not investigated what would be the implementation complexity in other engines, on the upside a "stub" implementation that correctly parses the information without acting on it seems to be not difficult in general.

============ Further reading about our technology (for context, should not be required for the discussion):

Presentation at Wasm SF Meetup: https://www.youtube.com/watch?v=7JUs4c99-mo (first 30sec of audio missing, apologies) Python3 Demo: https://www.leaningtech.com/pages/pythondemo.html https://medium.com/leaningtech/extreme-webassembly-1-pushing-browsers-to-their-absolute-limits-56a393435323 https://medium.com/leaningtech/preserving-flash-content-with-webassembly-done-right-eb6838b7e36f https://medium.com/leaningtech/running-flash-in-webassembly-using-cheerpx-an-update-d500b6fbc44e

alexp-sssup avatar Aug 13 '20 09:08 alexp-sssup

FWIW, I think adding new instructions would be the cleanest approach here, but a downside is that we have a growing number of branch instructions (e.g. br_on_exn, casting instructions) in addition to the if-else and br_if available in MVP. Would we want hinted versions of all of these instructions? Having assume or expect instructions would scale better in this sense.

I would also be curious to see data on the potential performance gains this extension would unlock; I don't have a good intuition for it.

tlively avatar Sep 11 '20 19:09 tlively

What about an branch-hint instruction that indicates the hint for the next following branch-like bytecode? It would be a no-op; i.e. it would have no effect on execution. That way you need not add a new zoo of hinted versions of br* and if. The branch hint could take an i32 immediate with 0 indicating not taken, 1 indicating taken, and N indicating the most likely target for a br_table.

titzer avatar Sep 13 '20 02:09 titzer

So it's to be x86-style prefixes then? 😁

taralx avatar Sep 13 '20 03:09 taralx

I agree with @titzer's approach, and I think we'll likely see more examples of these sorts of "modifying" instructions. The unwinding idea in WebAssembly/exception-handling#124 is a similar example, and you could imagine tail calls being expressed by a returning modifier for calling instructions rather than adding a return_ variant of each calling instruction.

RossTate avatar Sep 13 '20 14:09 RossTate

Similarly, there was earlier discussion on defining exception-throwing versions of trapping numeric operations (like division by 0). Are we ready to open the can of worms?

conrad-watt avatar Sep 13 '20 14:09 conrad-watt

Similarly, many of the atomic operations could have been expressed by using an atomic prefix. However, the branch hint prefix feels like a slightly different case, since it behaves as a no-op, rather than modifying the behavior of the existing instruction.

binji avatar Sep 14 '20 16:09 binji

The no-op branch-int instruction seems like a good idea. I will add it to the presentation for Tuesday

yuri91 avatar Sep 25 '20 09:09 yuri91

I started drafting a repo for this proposal here: https://github.com/yuri91/branch-hinting

There is a very brief overview, and a folder with a minimal benchmark.

The benchmark uses the br_on_null instruction, which currently in v8 behaves like an hinted branch (the null case is considered unlikely), so there is no need to modify the engine code.

Switching from a normal br_if to the br_on_null improves performance by about 30% on my machine. Of course it is a very contrived example, but useful nonetheless I think.

Any feedback is appreciated.

yuri91 avatar Oct 09 '20 13:10 yuri91

I think hinting should be more general and use probability similar to __builtin_expect_with_probability.

(module
  (func $foo (param $x i32) (result i32)
    (if (result i32)
      (expect value=1 probability=70
        (i32.eqz
          (local.get $x)
        )
      )
      i32.const 1
      i32.const 2
    )
  )
)

MaxGraey avatar Nov 10 '20 00:11 MaxGraey