
Integrating LLVM optimizations with wasm-opt

Open xuruiyang2002 opened this issue 7 months ago • 3 comments

This draft is about leveraging LLVM's optimizer to benefit wasm-opt.

Languages like C/C++ and Rust compile through LLVM and benefit a lot from its optimizations. However, not every language goes through LLVM (GC languages like Java, Kotlin, Dart, etc.). wasm-opt aims to take the role of a toolchain optimizer for them, but it misses some optimizations because it works at the AST level. For example, wasm-opt cannot remove the redundant store (the first one) here:

    ;; Store 1 into memory at address 0:
    (i32.const 0)     
    (i32.const 1)     
    (i32.store)       
    
    ;; Store 0 into memory at address 0:
    (i32.const 0)     
    (i32.const 0)     
    (i32.store)       
  
    ;; Load the value from memory address 0 and return it:
    (i32.const 0)     
    (i32.load)
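
To make the missed optimization concrete, here is a minimal sketch of dead-store elimination on a straight-line sequence like the one above. The tuple-based instruction format and the `eliminate_dead_stores` name are hypothetical, not Binaryen or LLVM APIs, and the sketch assumes constant addresses, equal access widths, and no calls or traps in between:

```python
# Toy instruction forms (hypothetical, for illustration only):
#   ("store", addr, value)  writes a 32-bit constant to a constant address
#   ("load", addr)          reads 32 bits from a constant address

def eliminate_dead_stores(instrs):
    """Drop stores that are overwritten before any load of the same address."""
    overwritten = set()  # addresses that a later store is known to overwrite
    kept = []
    # Scan backwards so that "what happens later" is already known.
    for ins in reversed(instrs):
        if ins[0] == "store":
            addr = ins[1]
            if addr in overwritten:
                continue  # a later store wins and nothing reads in between
            overwritten.add(addr)
        elif ins[0] == "load":
            overwritten.discard(ins[1])  # the value at this address is observed
        kept.append(ins)
    kept.reverse()
    return kept

program = [
    ("store", 0, 1),  # redundant: overwritten by the next store
    ("store", 0, 0),
    ("load", 0),
]
optimized = eliminate_dead_stores(program)
print(optimized)
```

The last store to each address is always kept, since linear memory is still observable after the sequence ends.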

The general idea is: translate Binaryen IR (from LLVM-compatible code) into LLVM IR, let LLVM's opt optimize it, and then translate the optimized result back. The most closely related work is "Speeding up SMT Solving via Compiler Optimization" (FSE 2023), which uses a similar approach by translating SMT queries into LLVM IR to benefit from LLVM optimizations.
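
As a rough illustration of the first step, the sketch below lowers the store/load snippet from above into LLVM IR text by simulating the wasm operand stack. Everything here is an assumption made for the example: the `wasm_to_llvm` function, the tuple instruction format, and modeling linear memory as a `ptr %mem` argument are not part of Binaryen or of the prototype, and only three opcodes are handled:

```python
def wasm_to_llvm(instrs, name="f"):
    """Translate a straight-line stack sequence into LLVM IR text (toy subset)."""
    stack = []   # symbolic operand stack holding LLVM operand strings
    body = []    # emitted LLVM IR lines
    counter = 0

    def fresh():
        nonlocal counter
        counter += 1
        return f"%t{counter}"

    def address(offset):
        # Assumption: linear memory is a byte pointer passed in as %mem.
        # Simplification: the wasm address is used directly as a GEP index.
        ptr = fresh()
        body.append(f"  {ptr} = getelementptr i8, ptr %mem, i32 {offset}")
        return ptr

    for op, *args in instrs:
        if op == "i32.const":
            stack.append(str(args[0]))
        elif op == "i32.store":
            value = stack.pop()
            ptr = address(stack.pop())
            # align 1 because wasm permits unaligned accesses
            body.append(f"  store i32 {value}, ptr {ptr}, align 1")
        elif op == "i32.load":
            ptr = address(stack.pop())
            result = fresh()
            body.append(f"  {result} = load i32, ptr {ptr}, align 1")
            stack.append(result)
        else:
            raise NotImplementedError(op)

    body.append(f"  ret i32 {stack.pop()}")
    return "\n".join([f"define i32 @{name}(ptr %mem) {{", *body, "}"])

snippet = [
    ("i32.const", 0), ("i32.const", 1), ("i32.store",),
    ("i32.const", 0), ("i32.const", 0), ("i32.store",),
    ("i32.const", 0), ("i32.load",),
]
ir = wasm_to_llvm(snippet)
print(ir)
```

The printed text could then be saved to a file and handed to LLVM's `opt` (for example with `-passes='default<O2>' -S`); the optimized output and the translation back to Binaryen IR are not shown here.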

An earlier prototype implementing this idea can be found in this PR: https://github.com/WebAssembly/binaryen/compare/main...kripken:binaryen:llvm. That experiment used existing tools like wabt, emcc, and LLVM's opt, but a direct 1-to-1 translation may be better.

(I'll continue this if time allows)

xuruiyang2002 • Jun 02 '25 15:06

I think there is a lot of potential here!

Btw, I remembered in https://github.com/WebAssembly/binaryen/issues/7637#issuecomment-2940584308 that our dataflow IR, which is SSA-like, may be useful here:

https://github.com/WebAssembly/binaryen/tree/main/src/dataflow

There is a simple pass that uses it:

https://github.com/WebAssembly/binaryen/blob/main/src/passes/DataFlowOpts.cpp

I'm not sure, but an option might be to use the existing Binaryen IR => DataFlow IR, and add DataFlow IR => LLVM IR (and the last part could be simpler since it would be SSA => SSA).

kripken • Jun 04 '25 16:06

There is now a proposal to add wasm input to upstream LLVM:

https://discourse.llvm.org/t/rfc-mlir-dialect-for-webassembly/86758

If accepted, that could be very useful here, as it would let some wasm modules be read by LLVM, optimized, and re-emitted as wasm.

They will never support all of wasm (like GC, I assume), but we could do work on our side to "filter" out the parts they can't handle, let them optimize, and then re-apply the filtered parts, something like that. That might still be a lot of work for us, but a lot less than otherwise.
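
A minimal sketch of that filter / optimize / re-apply flow, assuming function-level granularity. The module representation (a dict of function name to opcode strings), the opcode-prefix check, and `optimize_with_filter` are all hypothetical stand-ins, and `drop_nops` merely plays the role of the external optimizer:

```python
# Assumption: a crude opcode-prefix test stands in for a real feature analysis.
GC_PREFIXES = ("struct.", "array.", "ref.", "br_on_cast")

def uses_gc(body):
    """Return True if any opcode in the function body looks like a GC opcode."""
    return any(op.startswith(GC_PREFIXES) for op in body)

def optimize_with_filter(module, external_optimize):
    """module: dict mapping function name -> list of opcode strings."""
    # Filter: only hand over functions the external optimizer can handle.
    supported = {n: b for n, b in module.items() if not uses_gc(b)}
    optimized = external_optimize(supported)
    # Re-apply: keep the original body for everything that was held back,
    # preserving the original function order.
    return {n: optimized[n] if n in optimized else module[n] for n in module}

def drop_nops(functions):
    """Stand-in for the external optimizer: just removes nop instructions."""
    return {n: [op for op in b if op != "nop"] for n, b in functions.items()}

module = {
    "add": ["local.get", "nop", "local.get", "i32.add"],
    "make": ["i32.const", "nop", "struct.new"],  # uses GC, so it is held back
}
result = optimize_with_filter(module, drop_nops)
print(result)
```

A real version would also have to deal with calls between the optimized and held-back functions, which this sketch ignores.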

kripken • Jun 12 '25 18:06

Thanks for sharing, and I'll read it carefully.

xuruiyang2002 • Jun 13 '25 12:06