
Support non-uniform workgroups in Numba-dpex

Open · diptorupd opened this issue 2 years ago · 1 comment

Currently, numba-dpex does not allow submitting kernels with non-uniform work-groups, i.e., kernels where the local work-group size is not an integer factor of the global size in every dimension. We need to explore whether the restriction can be removed or at least relaxed.

For reference, ARM has an OpenCL extension, https://registry.khronos.org/OpenCL/extensions/arm/cl_arm_non_uniform_work_group_size.txt, that may help overcome the issue and that we can look at. It is worth discussing with the IGC team.

diptorupd · Jan 30 '23 06:01

If I may suggest something here: a useful addition would be a keyword that automatically rounds each dimension of the global size up to the next multiple of the corresponding dimension of the local size. Currently I do this by hand for each kernel:

global_size = math.ceil(n_work_items / work_group_size) * work_group_size
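Extended to several dimensions, the same rounding could look like the sketch below. This is plain Python and not part of numba-dpex; the helper name `round_up_global_size` is hypothetical, and it uses integer ceiling division instead of `math.ceil` on a float quotient, which can lose precision for very large sizes.

```python
def round_up_global_size(global_size, local_size):
    """Pad each global dimension up to the next multiple of the local one.

    Hypothetical helper for illustration; not a numba-dpex API.
    """
    if len(global_size) != len(local_size):
        raise ValueError("global_size and local_size must have the same rank")
    # -(-g // l) is integer ceiling division, so no float rounding is involved.
    return tuple(-(-g // l) * l for g, l in zip(global_size, local_size))


print(round_up_global_size((1000, 30), (64, 8)))  # (1024, 32)
```

The padded range contains work items beyond the real problem size, so the kernel still needs a bounds check that makes those extra items return early.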

fcharras · Jan 30 '23 10:01