Missing cast for coerced local variable stores
When storing an int value into a local variable of a small int type (byte/short), the value is not truncated in the decompiled code and ends with a subtle behavior difference.
Input code
.assembly A { }
.class C {
.method public static int32 Trunc1(int32 x) {
.maxstack 2
.locals (uint16 tmp)
ldarg x
stloc tmp
ldloc tmp
ret
}
}
Erroneous output
//Actual output:
public static int Trunc1(int x) {
return x;
}
//Expected output:
public static int Trunc1(int x) {
return (ushort)x;
}
Details
ILSpy version 8.0.0.7106-preview2
.NET version 6.0.11-servicing.22523.4+943474ca16db7c65ba6cff4a89c3ebd219dde3e5
I might try to take a look on this over the weekend and send a PR if possible.
The C# compiler usually inserts conv instructions. Is this pattern generated by some compiler or are you writing IL yourself?
Yes it's handwritten IL, but I believe this code is allowed as per ECMA spec.
This is caused by: https://github.com/icsharpcode/ILSpy/blob/a3191f19e2d6ccf66fba596be9fb8013cef57b01/ICSharpCode.Decompiler/IL/Transforms/ILInlining.cs#L182-L185
Unfortunately disabling inlining in this case has lots of side-effects on other stuff; resulting in 103 failed tests.
For what is worth, I tried to implement a patch for this issue a while back but lost interest before getting it into a commitable state (sorry)... As far as I remember, it broke some tests for code using the dynamic keyword due to some issues around bool handling.
I tried a similar patch yesterday. The issue with dynamic is easily fixed; but there's more complicated issues in other cases.
Basically, a whole bunch of transforms need to start supporting this extra conv instruction; or we need some logic to detect when the conv is redundant, to eliminate it when possible. The IsImplicitTruncation call during inlining is insufficient for that, because often we inline a ldloc stacksloc for which no further information is available yet; simplification only becomes possible in later decompiler passes.
Basically, the following decisions need to be made:
When to introduce the conv?
- Already in ILReader: stop
stlocfrom "implicitly truncating", make it explicit in all passes - In ILInlining
- In ILInlining, but restricted to only the late StatementTransform version (similar to how named arguments are also restricted)
If it's already happening in ILReader, all transforms will need to potentially deal with the conv (but we'll generally tend to "fail safe" by not applying invalid transformations). If it's not happening in ILReader, many transforms will need IsImplicitTruncation (or IsSmallIntegerType) safety checks (e.g. not only inlining itself, but also ControlFlowSimplification.InlineVariableInReturnBlock and all other transforms effectively end up inlining a local). That makes me think we should prefer the ILReader variant, but that's probably also the largest change to the existing code base...
Do we simplify away the conv (e.g. in ExpressionTransforms), or try to handle it in the various transforms that are otherwise impacted?
It's noteworthy at this point that if the conv is not simplified away, the ExpressionBuilder will already do so, i.e. there won't be redundant casts to byte in the C# code. In fact, ExpressionBuilder is in a uniquely good position to do this because it has the accurate C# types available. Earlier passes need more guesswork to determine the possible value range. So the point of simplifying conv earlier would just be to fix other transforms broken by the additional conv.
I don't really have a full list of impacted transforms yet, but it involves at least:
dynamicis_True/is_Falsehere- logical simplification (e.g.
logic.not(conv.u1(logic.or(a, b)))) YieldReturnDecompiler