runtime icon indicating copy to clipboard operation
runtime copied to clipboard

Detect mask usage for Across functions

Open SwapnilGaikwad opened this issue 1 year ago • 4 comments

Fixes: #101973

Optimises the pattern XAcross(ConditionalSelect(mask, a, 0))

  1. AddAcross(conditionalSelect(mask, a, 0)) ✅ Signed → AddAcross(mask, a) ✅ Unsigned → AddAcross(mask, a)

  2. AddSequentialAcross(a, conditionalSelect(mask, b, 0)) ✅ Signed → AddSequentialAcross(mask, b) ✅ Unsigned → AddSequentialAcross(mask, b)

  3. ❌ AddSequentialAcross(conditionalSelect(mask, a, 0), b)

  4. OrAcross(conditionalSelect(mask, a, 0)) ✅ Signed → OrAcross(mask, a) ✅ Unsigned → OrAcross(mask, a)

  5. XorAcross(conditionalSelect(mask, a, 0)) ✅ Signed → OrAcross(mask, a) ✅ Unsigned → OrAcross(mask, a)

  6. MaxAcross(conditionalSelect(mask, a, 0)) ❌ Signed ✅ Unsigned → MaxAcross(mask, a)

  7. MaxNumberAcross(conditionalSelect(mask, a, 0)) ❌ Signed No unsigned version available in the API. Takes only doubles or floats.

  8. ❌ AndAcross(conditionalSelect(mask, a, 0))

  9. ❌ MinAcross(conditionalSelect(mask, a, 0))

  10. ❌ MinNumberAcross(conditionalSelect(mask, a, 0))

@kunalspathak @a74nh @dotnet/arm64-contrib

SwapnilGaikwad avatar Oct 25 '24 12:10 SwapnilGaikwad

6. MaxAcross(conditionalSelect(mask, a, 0))

This can probably be done with MaxAcross(conditionalSelect(mask, a, MAX_INT)). I'd leave that for a future PR.

a74nh avatar Oct 25 '24 15:10 a74nh

also, seems there are failures in new test you added:

__tmp0_AcrossAndCselToAcross.cs:24:11: error: ARM64: expected string not found in input
//ARM64: {{^ *}} cmpne {{p[0-9]+.b}}, {{p[0-9]+/z}}, {{z[0-9]+.b}}, #0{{$}}
^
__jit_disasm.out:16:64: note: scanning from here
; BEGIN METHOD AcrossAndCselToAcross:addAcross_sbyte(byte,byte):System.Numerics.Vector`1[long]
^
__jit_disasm.out:26:2: note: possible intended match here
ldp fp, lr, [sp], #0x10
^

kunalspathak avatar Oct 29 '24 01:10 kunalspathak

also, seems there are failures in new test you added:

__tmp0_AcrossAndCselToAcross.cs:24:11: error: ARM64: expected string not found in input
//ARM64: {{^ *}} cmpne {{p[0-9]+.b}}, {{p[0-9]+/z}}, {{z[0-9]+.b}}, #0{{$}}
^
__jit_disasm.out:16:64: note: scanning from here
; BEGIN METHOD AcrossAndCselToAcross:addAcross_sbyte(byte,byte):System.Numerics.Vector`1[long]
^
__jit_disasm.out:26:2: note: possible intended match here
ldp fp, lr, [sp], #0x10
^

This is the issue that assembly checks are not scoped. They are tested for the entire method. Unfortunately, that mean when the test is skipped (!Sve.IsSupported), the assembly won't match. Do you have any suggestions to work the assembly check on non-sve machines?

SwapnilGaikwad avatar Oct 29 '24 13:10 SwapnilGaikwad

Dropped the optimisation for AddSequentialAcross and created an issue #109643 to track it. Also removed the assembly checks as they cannot be tested in absence of an SVE enabled machine in the CI. Created an issue to add assembly check #109644.

SwapnilGaikwad avatar Nov 08 '24 16:11 SwapnilGaikwad

/ba-g timeout infra issue.

kunalspathak avatar Nov 18 '24 13:11 kunalspathak