KernelAbstractions.jl

`shfl_down` intrinsic

Open christiangnrd opened this issue 1 month ago • 3 comments

This may be all that's needed?

We could also add simdgroup (warp, subgroup) indexing intrinsics, but I'd have to check whether every backend supports them (I assume they do?).

christiangnrd, Nov 22 '25 02:11

Your PR requires formatting changes to meet the project's style guidelines. Please consider running Runic (git runic main) to apply these changes.

Suggested changes:
diff --git a/test/intrinsics.jl b/test/intrinsics.jl
index 68fa9e48..d27de5e9 100644
--- a/test/intrinsics.jl
+++ b/test/intrinsics.jl
@@ -36,10 +36,10 @@ function shfl_down_test_kernel(a, b)
         value = temp[idx]
 
         value = value + KI.shfl_down(value, 16)
-        value = value + KI.shfl_down(value,  8)
-        value = value + KI.shfl_down(value,  4)
-        value = value + KI.shfl_down(value,  2)
-        value = value + KI.shfl_down(value,  1)
+        value = value + KI.shfl_down(value, 8)
+        value = value + KI.shfl_down(value, 4)
+        value = value + KI.shfl_down(value, 2)
+        value = value + KI.shfl_down(value, 1)
 
         b[idx] = value
     end
@@ -152,7 +152,7 @@ function intrinsics_testsuite(backend, AT)
             dev_a = AT(a)
             dev_b = AT(zeros(T, 32))
 
-            KI.@kernel backend() workgroupsize=32 shfl_down_test_kernel(dev_a, dev_b)
+            KI.@kernel backend() workgroupsize = 32 shfl_down_test_kernel(dev_a, dev_b)
 
             b = Array(dev_b)
             @test sum(a) ≈ b[1]

github-actions[bot], Nov 22 '25 02:11

So the backends that I am worried about are Metal and, to a lesser extent, Intel.

vchuravy, Nov 22 '25 03:11

> So the backends that I am worried about are Metal and, to a lesser extent, Intel.

What are your worries?

christiangnrd, Nov 22 '25 20:11