Detect mask usage for Across functions
Fixes: #101973
Optimises the pattern XAcross(ConditionalSelect(mask, a, 0))
-
AddAcross(conditionalSelect(mask, a, 0)) ✅ Signed → AddAcross(mask, a) ✅ Unsigned → AddAcross(mask, a)
-
AddSequentialAcross(a, conditionalSelect(mask, b, 0)) ✅ Signed → AddSequentialAcross(mask, b) ✅ Unsigned → AddSequentialAcross(mask, b)
-
❌ AddSequentialAcross(conditionalSelect(mask, a, 0), b)
-
OrAcross(conditionalSelect(mask, a, 0)) ✅ Signed → OrAcross(mask, a) ✅ Unsigned → OrAcross(mask, a)
-
XorAcross(conditionalSelect(mask, a, 0)) ✅ Signed → OrAcross(mask, a) ✅ Unsigned → OrAcross(mask, a)
-
MaxAcross(conditionalSelect(mask, a, 0)) ❌ Signed ✅ Unsigned → MaxAcross(mask, a)
-
MaxNumberAcross(conditionalSelect(mask, a, 0)) ❌ Signed No unsigned version available in the API. Takes only doubles or floats.
-
❌ AndAcross(conditionalSelect(mask, a, 0))
-
❌ MinAcross(conditionalSelect(mask, a, 0))
-
❌ MinNumberAcross(conditionalSelect(mask, a, 0))
@kunalspathak @a74nh @dotnet/arm64-contrib
6. MaxAcross(conditionalSelect(mask, a, 0))
This can probably be done with MaxAcross(conditionalSelect(mask, a, MAX_INT)). I'd leave that for a future PR.
also, seems there are failures in new test you added:
__tmp0_AcrossAndCselToAcross.cs:24:11: error: ARM64: expected string not found in input
//ARM64: {{^ *}} cmpne {{p[0-9]+.b}}, {{p[0-9]+/z}}, {{z[0-9]+.b}}, #0{{$}}
^
__jit_disasm.out:16:64: note: scanning from here
; BEGIN METHOD AcrossAndCselToAcross:addAcross_sbyte(byte,byte):System.Numerics.Vector`1[long]
^
__jit_disasm.out:26:2: note: possible intended match here
ldp fp, lr, [sp], #0x10
^
also, seems there are failures in new test you added:
__tmp0_AcrossAndCselToAcross.cs:24:11: error: ARM64: expected string not found in input //ARM64: {{^ *}} cmpne {{p[0-9]+.b}}, {{p[0-9]+/z}}, {{z[0-9]+.b}}, #0{{$}} ^ __jit_disasm.out:16:64: note: scanning from here ; BEGIN METHOD AcrossAndCselToAcross:addAcross_sbyte(byte,byte):System.Numerics.Vector`1[long] ^ __jit_disasm.out:26:2: note: possible intended match here ldp fp, lr, [sp], #0x10 ^
This is the issue that assembly checks are not scoped. They are tested for the entire method. Unfortunately, that mean when the test is skipped (!Sve.IsSupported), the assembly won't match.
Do you have any suggestions to work the assembly check on non-sve machines?
Dropped the optimisation for AddSequentialAcross and created an issue #109643 to track it.
Also removed the assembly checks as they cannot be tested in absence of an SVE enabled machine in the CI. Created an issue to add assembly check #109644.
/ba-g timeout infra issue.