
Implement L0 cooperative kernel functions

Open 0x12CC opened this issue 1 year ago • 2 comments

Defines urKernelSuggestMaxCooperativeGroupCountExp and urEnqueueCooperativeKernelLaunchExp to enable cooperative kernels with more than one work group.

0x12CC avatar May 04 '24 02:05 0x12CC

SYCL tests for this PR are passing here: https://github.com/intel/llvm/pull/13653.

The current implementation of urEnqueueCooperativeKernelLaunchExp is nearly identical to urEnqueueKernelLaunch. Apart from some minor differences, the main change is that it calls zeCommandListAppendLaunchCooperativeKernel. I'm not sure whether there's a better way to define it that reuses code, or whether the preference is to keep the implementations separate so that they can diverge in the future.
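For illustration, the code-reuse option could look roughly like the sketch below. It is not the adapter's code: every name in it (`fakeAppendLaunchKernel`, `fakeAppendLaunchCooperativeKernel`, `enqueueKernelLaunchImpl`, `CallLog`) is a hypothetical stand-in, and the real entry points take queues, kernels, work sizes, and events rather than a single integer. The only point is the shape: one shared body, with a flag selecting which append call is made.

```cpp
#include <string>
#include <vector>

// Hypothetical log so the sketch can show which path was taken.
std::vector<std::string> CallLog;

// Hypothetical stand-in for the regular Level Zero append call.
int fakeAppendLaunchKernel(int GroupCount) {
  CallLog.push_back("regular:" + std::to_string(GroupCount));
  return 0;
}

// Hypothetical stand-in for the cooperative Level Zero append call.
int fakeAppendLaunchCooperativeKernel(int GroupCount) {
  CallLog.push_back("cooperative:" + std::to_string(GroupCount));
  return 0;
}

// Shared body: validation and setup live here once; only the final
// append call depends on the Cooperative flag.
int enqueueKernelLaunchImpl(int GroupCount, bool Cooperative) {
  if (GroupCount <= 0)
    return -1; // shared argument check, assumed for the sketch
  return Cooperative ? fakeAppendLaunchCooperativeKernel(GroupCount)
                     : fakeAppendLaunchKernel(GroupCount);
}

// Thin public wrappers, mirroring the two separate entry points.
int enqueueKernelLaunch(int GroupCount) {
  return enqueueKernelLaunchImpl(GroupCount, /*Cooperative=*/false);
}

int enqueueCooperativeKernelLaunch(int GroupCount) {
  return enqueueKernelLaunchImpl(GroupCount, /*Cooperative=*/true);
}
```

The trade-off is the one raised above: a shared helper removes the duplication today, but if the cooperative path later needs different validation or setup, the flag turns into a growing set of branches, and separate implementations become easier to maintain.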

0x12CC avatar May 07 '24 17:05 0x12CC

It would be great to have some UR tests for cooperative kernels now that there is an implementation. Is there a plan for that, @0x12CC?

FYI, that doesn't block this PR, which is now in the merge queue.

kbenzie avatar May 22 '24 15:05 kbenzie