KernelAbstractions.jl
Specify ngroups as an alternative to ndrange
The ClimateMachine kernels often specify the ndrange as an integer multiple of the workgroup size, e.g.:
https://github.com/CliMA/ClimateMachine.jl/blob/feacd0238343cdf268ae660e56b301f487689efc/src/Numerics/DGMethods/DGModel.jl#L1241-L1242
In this case, it would be simpler to specify something like ngroups=info.nrealelem.
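As a rough sketch of what such an option would compute: the ndrange implied by ngroups is just the elementwise product of the workgroup size and the number of groups. The helper name `ndrange_from_ngroups` and the sizes below are hypothetical and only for illustration; nothing like this is claimed to exist in KernelAbstractions.

```julia
# Hypothetical helper, not part of KernelAbstractions: derive the ndrange
# implied by a workgroup size and a number of groups per dimension.
ndrange_from_ngroups(workgroup::NTuple{N,Int}, ngroups::NTuple{N,Int}) where {N} =
    workgroup .* ngroups

# Made-up sizes for illustration.
workgroup = (4, 4)
ngroups = (10, 1)
ndrange = ndrange_from_ngroups(workgroup, ngroups)
```

With ngroups, the caller would state the number of groups directly instead of multiplying it out by hand at every launch site.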
I have been thinking about this.
The reason I went with ndrange over ngroups was that traditional CUDA kernels need to include bounds-checks, and the hope was that we could do something clever like disabling them on the CPU or only running the bounds-checks on the border of the grid.
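To illustrate where those bounds-checks come from, here is a small sketch in plain Julia with made-up sizes (it emulates the arithmetic only and does not launch a real kernel). When the ndrange is not a multiple of the workgroup size, the launch is rounded up to whole groups, and the surplus work-items have to be masked out:

```julia
# Made-up sizes: 10 items of work, processed in groups of 4.
ndrange = 10
workgroup = 4

# The launch has to cover the ndrange with whole groups.
ngroups = cld(ndrange, workgroup)
launched = ngroups * workgroup

# Only work-items inside the ndrange may touch memory; this `i <= ndrange`
# test is the bounds-check a hand-written CUDA kernel would carry.
active = [i for i in 1:launched if i <= ndrange]
padding = launched - length(active)
```

Only the last group contains padding here, which is why checking only on the border of the grid is attractive.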
I have since seen people confused by the semantics of ndrange (partly because the documentation is so sparse). I was thinking of removing it from KernelAbstractions entirely and replacing it with ngroups, but I am worried this would be problematic for non-CLIMA kernels, since it would prevent us from enabling vectorization.
Adding both might work.
In this case, would there be any downside to writing
workgroup = (info.Nq[1], info.Nq[2], 1)
ndrange = (info.Nq[1], info.Nq[2], info.nrealelem)
and then in the kernel
i, j, e = @index(Global, NTuple)
?
No, that should be fine.
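For reference, a minimal sketch of that pattern on the CPU backend. The kernel name `tag_elements!` and the sizes `Nq` and `nrealelem` are made up for illustration, and the launch syntax assumes a recent KernelAbstractions release (0.9-style `CPU()` backend and `synchronize`); older versions returned an event from the launch that had to be waited on instead.

```julia
using KernelAbstractions

# Illustrative kernel: each work-item records which element it belongs to.
@kernel function tag_elements!(A)
    i, j, e = @index(Global, NTuple)
    A[i, j, e] = e
end

# Made-up sizes standing in for info.Nq and info.nrealelem.
Nq, nrealelem = 4, 3
A = zeros(Int, Nq, Nq, nrealelem)

backend = CPU()
workgroup = (Nq, Nq, 1)          # one workgroup per element
ndrange = (Nq, Nq, nrealelem)    # an exact multiple of the workgroup size

kernel! = tag_elements!(backend, workgroup)
kernel!(A; ndrange = ndrange)
KernelAbstractions.synchronize(backend)
```

Because the ndrange is an exact multiple of the workgroup size, every work-item maps to a valid (i, j, e) and no manual bounds-check is needed in the kernel body.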