ginkgo
(CUDA) kernels inside namespace or suffix `_kernel`
In most cases, we append the suffix `_kernel` to a function to identify it as a function that is not executed on the host machine (currently only CUDA code). However, there are instances where we instead use a `namespace kernel` around these functions (as far as I can tell, this happens in `cuda/matrix/dense_kernels.cu`, `cuda/components/zero_array.cu` and `cuda/preconditioner/jacobi_kernels.cu`).
What do you think about that, @ginkgo-project/developers ?
We should add it to the Contributing guidelines when we have decided on the matter.
Personally, I prefer the suffix `_kernel`: it makes it clear at every call site, and when looking at the function directly, that this is a special function whose call launches it on a different device (with a namespace, you would also have to look at the code surrounding the function). But I don't mind the alternative either; I would just like it to be consistent.
Just for completeness, `jacobi_kernels.cu` and `zero_array.cu` also use the `namespace kernel`.
@tcojean True, I guess my search was not thorough enough. Updated the issue description.
The `namespace kernel` is also used in all executor files: `omp/` and `reference/` as well as `cuda/`. And only in the `cuda/` folder do some of the functions have the suffix `_kernel`. There we specifically don't have sub-functions to add the `_kernel` suffix to (at least not yet). So maybe the namespace is a good idea?
@pratikvn Really, where? Every part is inside a `namespace kernels`, inside which a `namespace omp`, `reference` or `cuda` follows. But I did not find any function encapsulated inside a `namespace kernel` in the others.
> Every part is inside a namespace kernels, after which a namespace omp, reference or cuda comes

That is what I meant. And I think that is enough? In `cuda/matrix/dense_kernels.cu` and others, some functions are additionally placed in a separate namespace, and that is confusing.
The question is how we should encapsulate the actual CUDA kernels (those that have to be launched with `<<<grid, dim, ...>>>`) in order to associate them with the global function. For example, if an algorithm has a function `initialize`, we name the associated CUDA kernel (which is called from `initialize`) either `initialize_kernel`, or we put it inside an additional namespace as `kernel::initialize`.
We might actually need the namespace, so we can also put the `__device__` functions in there.