KernelAbstractions.jl
`shfl_down` intrinsic
This may be all that's needed?
Could maybe also add simdgroup (warp, subgroup) indexing intrinsics, but I'd have to check whether every backend supports them (I assume they do?)
Your PR requires formatting changes to meet the project's style guidelines.
Please consider running Runic (git runic main) to apply these changes.
Suggested changes:
diff --git a/test/intrinsics.jl b/test/intrinsics.jl
index 68fa9e48..d27de5e9 100644
--- a/test/intrinsics.jl
+++ b/test/intrinsics.jl
@@ -36,10 +36,10 @@ function shfl_down_test_kernel(a, b)
value = temp[idx]
value = value + KI.shfl_down(value, 16)
- value = value + KI.shfl_down(value, 8)
- value = value + KI.shfl_down(value, 4)
- value = value + KI.shfl_down(value, 2)
- value = value + KI.shfl_down(value, 1)
+ value = value + KI.shfl_down(value, 8)
+ value = value + KI.shfl_down(value, 4)
+ value = value + KI.shfl_down(value, 2)
+ value = value + KI.shfl_down(value, 1)
b[idx] = value
end
@@ -152,7 +152,7 @@ function intrinsics_testsuite(backend, AT)
dev_a = AT(a)
dev_b = AT(zeros(T, 32))
- KI.@kernel backend() workgroupsize=32 shfl_down_test_kernel(dev_a, dev_b)
+ KI.@kernel backend() workgroupsize = 32 shfl_down_test_kernel(dev_a, dev_b)
b = Array(dev_b)
@test sum(a) ≈ b[1]
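For readers following the test above: the kernel sums 32 values by repeatedly adding the value held `offset` lanes further down, with offsets 16, 8, 4, 2, 1. The following is a CPU-only sketch of that pattern in plain Julia, not the PR's implementation. `simulated_shfl_down` and `simulated_warp_sum` are hypothetical helpers written for illustration; the sketch assumes a power-of-two lane count and that lanes whose source falls outside the group keep their own value (as CUDA's `__shfl_down_sync` does), which other backends may not guarantee.

```julia
# CPU-only simulation of a shuffle-down tree reduction.
# Both functions are hypothetical helpers, not part of KernelAbstractions.jl.

# Each lane i reads the value held by lane i + offset. Lanes whose source
# would fall outside the group keep their own value (assumed behaviour).
function simulated_shfl_down(values::Vector{T}, offset::Int) where {T}
    n = length(values)
    return [i + offset <= n ? values[i + offset] : values[i] for i in 1:n]
end

# Halve the offset each step; assumes length(values) is a power of two.
function simulated_warp_sum(values::Vector{T}) where {T}
    lanes = copy(values)
    offset = length(lanes) ÷ 2
    while offset >= 1
        lanes = lanes .+ simulated_shfl_down(lanes, offset)
        offset ÷= 2
    end
    return lanes[1]  # only the first lane is guaranteed to hold the full sum
end

a = Float32.(1:32)
@assert simulated_warp_sum(a) ≈ sum(a)
```

Only the first lane ends up with the complete sum, which is why the test compares `sum(a)` against `b[1]` alone.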
So the backends that I am worried about are Metal and, to a lesser extent, Intel.
What are your worries?