KernelAbstractions.jl

specify ngroups as alternative to ndrange

Open simonbyrne opened this issue 4 years ago • 3 comments

The ClimateMachine kernels often specify the ndrange as an integer multiple of the workgroup size, e.g.: https://github.com/CliMA/ClimateMachine.jl/blob/feacd0238343cdf268ae660e56b301f487689efc/src/Numerics/DGMethods/DGModel.jl#L1241-L1242

In this case, it would be simpler to be able to specify something like ngroups=info.nrealelem.
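To make the relationship concrete, here is a minimal sketch of how an ndrange could be derived from a group count. The helper name `ndrange_from_ngroups` is hypothetical and is not part of KernelAbstractions.jl; it only illustrates the arithmetic that an `ngroups` keyword would presumably perform.

```julia
# Hypothetical helper, NOT part of KernelAbstractions.jl.
# 1-D case: each of `ngroups` workgroups covers `workgroupsize` work items.
ndrange_from_ngroups(workgroupsize::Integer, ngroups::Integer) =
    workgroupsize * ngroups

# N-D case: multiply the two tuples element-wise.
ndrange_from_ngroups(workgroupsize::NTuple{N,Integer},
                     ngroups::NTuple{N,Integer}) where {N} =
    map(*, workgroupsize, ngroups)

# Example values (assumed for illustration): 4x4 nodes per element, 10 elements.
workgroup = (4, 4, 1)
ndrange = ndrange_from_ngroups(workgroup, (1, 1, 10))
```

With these example values, one workgroup corresponds to one element, so the group count along the last dimension plays the role of `info.nrealelem`.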

simonbyrne avatar Jan 07 '21 05:01 simonbyrne

I have been thinking about this.

The reason I went with ndrange over ngroups was that traditional CUDA kernels need to include bounds checks, and the hope was that we could do something clever, like disabling them on the CPU or only running the bounds checks on the border of the grid.
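For readers unfamiliar with the bounds-check pattern mentioned here, the following is a serial emulation in plain Julia, assuming a 1-D launch. It is only an illustration of why the check is needed when the problem size is not a multiple of the workgroup size; it is not how KernelAbstractions.jl implements it, and `emulate_launch!` is a made-up name.

```julia
# Serial emulation of a 1-D launch over `length(out)` items in groups of
# `groupsize`. Hypothetical illustration, not KernelAbstractions.jl code.
function emulate_launch!(out::AbstractVector, groupsize::Integer)
    n = length(out)
    ngroups = cld(n, groupsize)              # round up so every item is covered
    for group in 1:ngroups, lane in 1:groupsize
        i = (group - 1) * groupsize + lane   # global linear index
        i <= n || continue                   # bounds check: skip padding lanes
        out[i] = group                       # record which group handled item i
    end
    return ngroups
end

out = zeros(Int, 10)
ngroups = emulate_launch!(out, 4)   # 10 items, groups of 4: last group is partial
```

Only the last group contains lanes that fail the check, which is the sense in which the check could be restricted to the border of the grid.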

I have since seen folks being confused about the semantics of ndrange (partly because the documentation is so sparse), and I was thinking of removing it entirely from KernelAbstractions and replacing it with ngroups. I am worried, though, that for non-CliMA kernels this might be problematic, since doing so would prevent us from enabling vectorization.

Adding both might work.

vchuravy avatar Jan 07 '21 17:01 vchuravy

In this case, would there be any downside to writing

workgroup = (info.Nq[1], info.Nq[2], 1)
ndrange = (info.Nq[1], info.Nq[2], info.nrealelem)

and then in the kernel

i, j, e = @index(Global, NTuple)

?

simonbyrne avatar Jan 07 '21 17:01 simonbyrne

No, that should be fine.

vchuravy avatar Jan 07 '21 18:01 vchuravy
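For reference, a minimal runnable sketch of the launch configuration discussed in this thread. It assumes the KernelAbstractions.jl 0.9-style API (a backend object plus `KernelAbstractions.synchronize`); versions contemporary with this thread returned an event to `wait` on instead. The kernel name `tag_nodes!` and the sizes `Nq` and `nrealelem` are placeholders, not taken from ClimateMachine.

```julia
using KernelAbstractions

# Placeholder kernel: writes an encoding of (i, j, e) into each entry.
@kernel function tag_nodes!(out)
    i, j, e = @index(Global, NTuple)   # e indexes the element, (i, j) the node
    out[i, j, e] = 100 * e + 10 * j + i
end

Nq = (4, 4)      # assumed nodes per element in each direction
nrealelem = 3    # assumed number of elements

backend = CPU()
out = zeros(Int, Nq[1], Nq[2], nrealelem)

# One workgroup per element: the workgroup spans (i, j), ndrange adds e.
workgroup = (Nq[1], Nq[2], 1)
ndrange = (Nq[1], Nq[2], nrealelem)

kernel! = tag_nodes!(backend, workgroup)
kernel!(out; ndrange = ndrange)
KernelAbstractions.synchronize(backend)
```

Because the ndrange is an exact multiple of the workgroup size in every dimension, no work item falls outside the array.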