ParallelStencil.jl
Unsupported keyword argument 'inbounds' with CUDA.jl
When I try to use the inbounds keyword with CUDA kernels, I get an Unsupported keyword argument 'inbounds' error:
using CUDA
using ParallelStencil
@init_parallel_stencil(CUDA, Float64, 2)
n = 1024
A = @rand(n, n)
B = @zeros(n, n)
@parallel_indices (I...) function foo!(A, B)
    A[I...] = B[I...]
    nothing
end
julia> @parallel inbounds=true (1:n, 1:n) foo!(A, B)
ERROR: LoadError: ArgumentError: Unsupported keyword argument 'inbounds'
Stacktrace:
[1] var"@cuda"(__source__::LineNumberNode, __module__::Module, ex::Vararg{Any})
@ CUDA ~/.julia/packages/CUDA/1kIOw/src/compiler/execution.jl:57
The same call seems to work just fine on the CPU backend. Any chance I am misusing it?
This keyword argument is for function definitions, not function calls (see ?@parallel).
In a function call, @parallel simply forwards any keyword argument it does not recognize to the GPU backend (here CUDA.jl's @cuda, which rejects it, as your stack trace shows). In the CPU case, unknown keyword arguments are assumed to be meant for the GPU backend and are silently ignored. This is why it appeared to work in the CPU case for you.
That said, you can call @inbounds @parallel (1:n, 1:n) foo!(A, B), but at least as of now I don't think it has any effect in the GPU case.
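To illustrate the definition-versus-call distinction, here is a minimal sketch that puts inbounds=true on the kernel definition and keeps the call free of it. It assumes a ParallelStencil version whose @parallel_indices accepts the keyword between the index tuple and the function (check ?@parallel_indices for your installed version). It uses the Threads backend so it runs without a GPU, and the kernel name copy_kernel! and the size n = 64 are made up for the example.

```julia
using ParallelStencil
# Assumption: Threads backend so the sketch runs on any machine;
# swap in CUDA (after `using CUDA`) for the GPU case.
@init_parallel_stencil(Threads, Float64, 2)

# inbounds=true belongs to the kernel *definition*.
# Assumption: the keyword is placed after the index tuple, before `function`.
@parallel_indices (ix, iy) inbounds=true function copy_kernel!(A, B)
    A[ix, iy] = B[ix, iy]
    return
end

n = 64                # hypothetical problem size
A = @zeros(n, n)
B = @rand(n, n)

# The kernel *call* carries only the ranges, no inbounds keyword.
@parallel (1:n, 1:n) copy_kernel!(A, B)
```

Because the ranges (1:n, 1:n) match the array sizes exactly, every index is valid, which is the situation in which skipping bounds checks is safe.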