wasmtime
wasmtime copied to clipboard
[WIP / RFC] Cranelift: Basic support for EGraph roundtripping.
This is a work-in-progress, and meant to sketch the direction I've been thinking in for a mid-end framework. A proper BA RFC will come soon.
This PR builds a phase in the optimization pipeline that converts a CLIF CFG into an egraph representing the function body. Each node represents an original instruction or operator. The "skeleton" of side-effecting instructions is retained, but non-side-effecting (pure) operators are allowed to "float": the egraph will naturally deduplicate them during build, and we will determine their proper place when we convert back to a CFG representation.
The conversion from the egraph back to the CFG is done via a new algorithm I call "scoped elaboration". The basic idea is to do a preorder traversal of the domtree, and at each level, evaluate the values of the eclasses called upon by the side-effect skeleton, memoizing in an eclass-to-SSA-value map. This map is a scoped hashmap, with scopes at each domtree level. In this way, (i) when a value is computed in a location that dominates another instance of that value, the first replacees the second; but (ii) we never produce "partially dead" computations, i.e. we never hoist to a level in the domtree where a node is not "anticipated" (always eventually computed). This exactly matches what GVN does today. With a small tweak, it can also subsume LICM: we need to be loop-nest-aware in our recursive eclass elaboration, and potentially place nodes higher up the domtree (and higher up in the scoped hashmap).
Unlike what I had been thinking in Monday's meeting, this produces CLIF out of the egraph and then allows that to be lowered. It's overall simpler and a better starting point (thanks @abrown for tipping me over the edge in this). The way it produces CLIF now could be made more efficient: it could reuse instructions already in the DFG for nodes that are not duplicated (likely most of them) rather than clearing all and repopulating.
This PR does not do anything to actually rewrite in the egraph. That's the next step! I need to work out exactly how to integrate ISLE with some sort of rewrite machinery. I have some ideas about efficient dispatch with an "operand-tree discriminants shape analysis" on the egraph and indexing rules by their matched shape; more to come.
cc @fitzgen @jameysharp @abrown @avanhatt @mlfbrown
Subscribe to Label Action
cc @peterhuene
Thus the following users have been cc'd because of the following labels:
- peterhuene: wasmtime:api
To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.
How fast is this code? How much would the increase in compile time for this be?
How fast is this code? How much would the increase in compile time for this be?
I haven't measured yet but I will soon. There are some optimizations I want to do first (e.g., make the egraph-to-CLIF lowering work without rebuilding the whole function body, reusing existing Insts where no node duplication occurs). The path that runs without egraph-based opts will always exist, so debug builds shouldn't be affected; and my hope is that by subsuming GVN, LICM, simple_preopt, and removing the need to run them multiple times, we might not be too much worse or maybe even close to parity when sticking to a "very basic opts" level. That said the goal here will also be to allow more configurability in that tradeoff: eventually, it would be great to have a larger body of rewrite rules, and some framework that lets us configure "fuel" for applying those rules to some configurable limit. (More on this in the RFC for sure, when it comes!)
Subscribe to Label Action
cc @cfallin, @fitzgen
Thus the following users have been cc'd because of the following labels:
- cfallin: isle
- fitzgen: isle
To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.
Superseded by #4953; closing this one!