wasmtime icon indicating copy to clipboard operation
wasmtime copied to clipboard

cranelift: Sign extend immediates in instructions that embed them.

Open afonso360 opened this issue 2 years ago • 5 comments

👋 Hey,

This PR alters the behaviour of *_imm instructions to sign extend their immediate argument when the control type is i128.

This comes from a fuzzer issue where the interpreter was sign extending the immediates but the legalizations were not.

Fixes: #4568 Fixes: #4641

afonso360 avatar Aug 03 '22 22:08 afonso360

As I stated in the linked issue, IMHO the CLIF instruction definition should be updated.

akirilov-arm avatar Aug 04 '22 14:08 akirilov-arm

As I stated in the linked issue, IMHO the CLIF instruction definition should be updated.

You are right, I didn't read your comment properly (sorry!), ill mark this as draft until we make that decision.

afonso360 avatar Aug 04 '22 14:08 afonso360

@akirilov-arm I've updated the documentation, is this all you meant, or did I miss something else?

Also, cc @cfallin since I can't request reviews.

afonso360 avatar Aug 09 '22 19:08 afonso360

I also think zero extending the unsigned operations is probably the better choice.

However, what if we instead altered the type of the instruction to accept the full Imm128?

I don't think this would grow the size of InstructionData since we already have Shuffle which also stores a Imm128. However Shuffle stores the immediate in the DataFlowGraph, we could do the same, and in this legalization pass emit the appropriate iconst / iconcat / sextend depending on what the constant is.

Would storing the Imm in the DFG for these operations cause performance issues?

afonso360 avatar Aug 10 '22 13:08 afonso360

Actually that would be the ideal design: from first principles one would expect "with immediate" variants of 128-bit ops to carry 128-bit (16-byte) immediates.

However we can't carry that inline in the instruction, as you're hinting, because InstructionData is 16 bytes total (and we want to keep it that way).

I suspect that indirecting for all constants would have some nontrivial overhead, though it might be a bit of work to measure (a lot of refactoring).

I had another thought, though, that may be more "honest" about the costs: what if we limited (i) op_imm instructions and (ii) the iconst instruction to work for up to 64 bits only? Producers that want to emit full 128-bit values already produce two iconsts and iconcat them. In some sense, trying to allow iconst.i128 -1 or the op_imm folded variant is lulling producers into a false sense of what's supported. As soon as one writes iconst.i128 -0x8000_0000_0000_0001 one finds out that there are only half as many bits in the immediate field as the type otherwise implies.

Thoughts on that?

cfallin avatar Aug 10 '22 17:08 cfallin

I suspect that indirecting for all constants would have some nontrivial overhead, though it might be a bit of work to measure (a lot of refactoring).

I was proposing that only for *_imm variants. The way I see it, these instructions are already just sort of "helpers" in the sense that they just construct a pattern that we recognize (i.e. iadd + iconst), they could just emit more advanced patterns to support i128 (i.e. iadd+iconst+sext or iadd+iconst+iconcat all in 64bits).

Having more operations only work up to i64 to me feels like i128 is a second class citizen, which is sort of true for the current backends, but I don't really like it at the lang level.

I don't know, I'm a bit undecided on this, I'd like to hear what other people think.

afonso360 avatar Aug 10 '22 19:08 afonso360

Generally I agree at an IR-design level: it would be nice to have orthogonality here, and allow all types up through i128 in all places where integers are allowed.

This compromise feels similar to "small immediates" in RISC ISAs though: we have a relatively small struct size for instructions, and its size is critical for performance, and its entire width would be taken by just the i128 immediate (hence it would need to grow). So in that sense I don't think we can have an inline Imm128.

I would be curious to see the overhead of a "every immediate is indirect" approach, if you're up for prototyping that. If it's actually negligible, then I think it's a reasonable change to make.

cfallin avatar Aug 10 '22 19:08 cfallin

I would be curious to see the overhead of a "every immediate is indirect" approach, if you're up for prototyping that. If it's actually negligible, then I think it's a reasonable change to make.

Me too, and I think we could mitigate some of the cost (if it is non-negligible) by having two const InstructionData variants, one inline up to 64bits and one in the DFG for larger values, and choose between them when the user emits a iconst (can we do this?).

Although right now I don't have a lot of time to try this out.

I'm planning on addressing the feedback above and getting this merged, is that okay, or would you prefer restricting the types that we accept?

afonso360 avatar Aug 11 '22 17:08 afonso360