Chris Fallin
One reason to try to lift optimizations like this into ISLE, and keep the VCode insts as close to 1-to-1 correspondence with machine code as possible, is that it makes...
This depends on #4992 -- a little bit of core runtime functionality (trap handling, unwind info generation) necessary for this OS/architecture pair. If you're willing to work on this, we'd...
@afonso360, if you still have access to your RISC-V hardware and a bit of time to test, it would be useful to confirm that this changes the result of the...
Probably the latter -- the loads are not removed, and make it all the way to machine code.
(More specifically, a load inserted by this pass *could* later be removed by alias analysis, but only if subsumed by another earlier load to the same address, so there is...
One other performance-related note: this will possibly have worse impact when the pooling allocator with copy-on-write is enabled in Wasmtime, as the "first touch" to a page matters: if demanded...
The problem I see with that approach is that we only know that dynamically: consider if we have a 64-bit store (allowed to be unaligned) to address `X`, we would...
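As a small illustration of why a property like this is only known dynamically: whether an unaligned 8-byte access at `X` stays within one page or straddles two depends entirely on the runtime value of `X`. The sketch below is not Cranelift code; the helper name `spans_two_pages` and the 4 KiB page size are assumptions made for the example, and it is only one possible reading of the scenario the excerpt describes.

```rust
/// Page size assumed for illustration (4 KiB); real systems may use other sizes.
const PAGE_SIZE: u64 = 4096;

/// Hypothetical helper (not part of Cranelift or Wasmtime): returns true if
/// the access covering bytes `[addr, addr + size)` touches more than one page.
/// Expects `0 < size <= PAGE_SIZE`.
fn spans_two_pages(addr: u64, size: u64) -> bool {
    debug_assert!(size > 0 && size <= PAGE_SIZE);
    // Page index of the first byte accessed.
    let first_page = addr / PAGE_SIZE;
    // Page index of the last byte accessed; wrapping_add avoids a panic
    // for addresses at the very top of the address space.
    let last_page = addr.wrapping_add(size - 1) / PAGE_SIZE;
    first_page != last_page
}
```

For a 64-bit store, `spans_two_pages(x, 8)` is false for most values of `x` and true only for the seven offsets just below each page boundary, which a compiler cannot distinguish statically when `x` is an arbitrary runtime address.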
I think that would work, yeah. There are some other interesting performance tradeoffs here -- two that come to mind are "partial-store forwarding" (partially overlapping loads/stores can be problematic for...
Ah yes, indeed. This is only a problem on aarch64 and riscv64, and AFAICT all aarch64 hardware one would *want* to run a high-performance server on (Apple implementations, AWS machines,...
IIRC, the "legacy" passes we call before egraph opt are the ones that are important for compile time, or at least were when egraph opt was introduced. Basically as you...