wasmtime icon indicating copy to clipboard operation
wasmtime copied to clipboard

refactor(cranelift-codegen): remove `OnceLock`

Open JonasKruckenberg opened this issue 1 year ago • 7 comments
trafficstars

This removes the dependency on std::sync::OnceLock by lifting the state out of a static and into the Callee struct.

This PR also removes the call_conv parameter since it was not used by any of the trait implementations.

JonasKruckenberg avatar Apr 26 '24 10:04 JonasKruckenberg

ping @elliottt (or other cranelift folks on this), I don't know enough about MachineEnv myself to know if the lift here is reasonable to apply

alexcrichton avatar May 02 '24 17:05 alexcrichton

This could possibly result in a performance degradation in the compiler: it removes the caching of the MachineEnv, which contains lists of registers (heap-allocated Vecs), and recreates it for each function that is compiled. Especially for small functions this could be a measurable/non-negligible portion of the total work.

@JonasKruckenberg, could you say more about the motivation here? And have you measured the performance impact?

cfallin avatar May 02 '24 19:05 cfallin

Yeah thats a concern I had too. The motivation is to support #8341 and since OnceLock is the only thing in cranelift that cant be ported to no_std+alloc or gated behind flags the idea was to replace it by something else. Adding another dependency on something like spin for no_std targets doesn't seem appropriate either.

I'll do some benchmarking next week and report back 👍

JonasKruckenberg avatar May 02 '24 21:05 JonasKruckenberg

Ah, one other option could be to put it in the MachBackend instead -- if it's the case that it no longer depends on the call_conv. If it does on some backends (x64 might differentiate with fastcall?), we could statically construct the various options. That way, it's constructed once per compiler session, but without the lazy init.

cfallin avatar May 02 '24 22:05 cfallin

FWIW I ran this locally and on sightglass it reported a 1-3% slowdown in compiling spidermonkey.wasm, but no changes compiling other benchmarks. (I didn't re-run to see if that number was noise)

alexcrichton avatar May 10 '24 19:05 alexcrichton

@uweigand introduced the call_conv parameter on get_machine_env in #6957 with the intent to use it to have different sets of allocatable registers for the tail calling convention than for the system calling convention. We talked about that a little at the Cranelift meeting last week and it sounds to me like he still wants to use it that way, so I don't think we should remove it.

Regarding the main point of this PR though: I believe we can get rid of the need for OnceLock here by changing regalloc2 so MachineEnv holds a pair of PRegSet, which are small and const-evaluatable, instead of the six Vecs. Then a MachineEnv can be constructed in the initializer of a static item (note that Cranelift always sets fixed_stack_slots to an empty Vec, and Vec::new is a const fn). Then we can just return a borrow of a constant static in all of these cases, possibly choosing between several definitions based on the flags or calling convention.

jameysharp avatar May 12 '24 21:05 jameysharp

@uweigand introduced the call_conv parameter on get_machine_env in #6957 with the intent to use it to have different sets of allocatable registers for the tail calling convention than for the system calling convention. We talked about that a little at the Cranelift meeting last week and it sounds to me like he still wants to use it that way, so I don't think we should remove it.

It's not completely certain, depending on choice of frame pointer register, but in general it would be good to have this option. Allocatable registers in general may depend on calling convention.

uweigand avatar May 13 '24 15:05 uweigand