wasmtime
[cranelift] Avoid 64-bit imul_imm if possible on all architectures
Feature
Not all architectures have a fast 64-bit imul with an immediate operand. Even on modern cores such as the SnB family and AMD Ryzen, it has 3-cycle latency and 1-cycle throughput, which is not always faster than an lea + shl / add combination. So I propose lowering to lea + shl / add for non-power-of-two constants ~~at least for imm < 400~~ with low hamming weight, and using 64-bit imul only when that is not possible. This is similar to what GCC does:
https://godbolt.org/z/aG7bPer9v
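To make the idea concrete, here is a minimal sketch in plain Rust of what the shift/add expansion computes. It is not Cranelift's lowering code, and `mul_shift_add` is a hypothetical helper name used only for illustration. It emits one shifted term per set bit of the constant, which is why the number of set bits is what determines the cost.

```rust
/// Multiply `x` by the constant `imm` using only shifts and adds.
/// One shifted copy of `x` is added per set bit of `imm`; arithmetic
/// wraps modulo 2^64, matching the low 64 bits produced by `imul`.
/// (Hypothetical helper for illustration, not a Cranelift API.)
fn mul_shift_add(x: u64, imm: u64) -> u64 {
    let mut acc = 0u64;
    let mut bits = imm;
    while bits != 0 {
        // Position of the lowest set bit; always < 64 because bits != 0.
        let shift = bits.trailing_zeros();
        acc = acc.wrapping_add(x << shift);
        // Clear the lowest set bit and continue with the rest.
        bits &= bits - 1;
    }
    acc
}
```

For example, `imm = 10` (binary `1010`) becomes `(x << 3) + (x << 1)`. On x86-64 some of these terms can additionally be folded into a single `lea`, since its addressing mode can scale an index by 2, 4, or 8 and add a base.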
> for non-power of two constants at least for imm < 400
Should this check for a low hamming weight rather than a max value?
> low hamming weight rather than a max value?
Yeah, perhaps that would be better.
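A rough sketch of what such a check could look like, in plain Rust. The cutoff `MAX_SET_BITS` and the name `prefer_shift_add` are assumptions made up for illustration; the right threshold would have to come from measurements on each target.

```rust
/// Hypothetical cutoff: the largest number of set bits for which the
/// shift/add expansion is still assumed to beat a 64-bit imul.
const MAX_SET_BITS: u32 = 3;

/// Returns true if `imm` looks like a good candidate for shift/add lowering.
/// Zero and powers of two (hamming weight 0 or 1) are excluded because they
/// are already covered by a constant or a single shift.
fn prefer_shift_add(imm: u64) -> bool {
    let weight = imm.count_ones();
    weight >= 2 && weight <= MAX_SET_BITS
}
```

The difference from a max-value check shows up at both ends: 255 is below 400 but has eight set bits, so it is rejected here, while `(1 << 40) | 1` is far above 400 but has only two set bits, so it is accepted.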