[CUDA][HIP] Fix host task mem migration and add pi entry point for urEnqueueNativeCommandExp

Open hdelan opened this issue 1 year ago • 0 comments

The SYCL RT assumes that, for devices in the same context, no mem migration needs to occur across devices for a kernel launch or host task (HT). However, a CUdeviceptr is only valid on a specific device, so mem migration must occur between devices in a context. If the SYCL RT is to keep its assumption that native mems are accessible to all devices in a context, it must hand off the HT lambda to the plugin, so that the plugin can handle the necessary mem migration.

This patch uses the new urEnqueueNativeCommandExp entry point to execute the HT lambda; the plugin then takes care of mem migration implicitly.
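To illustrate the hand-off described above, here is a minimal toy model in plain C++. It is a sketch under stated assumptions, not the adapter's actual code: `ToyMem`, `ToyQueue`, and `enqueueNativeCommand` are hypothetical names, device allocations are modelled as host vectors, and the migration policy (copy from the last device that used the mem) is an assumption made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Toy model only: none of these types are real UR or CUDA adapter types.
struct ToyMem {
  // One allocation per device in the context, mirroring the fact that a
  // CUdeviceptr is only meaningful on the device that owns it.
  std::vector<std::vector<int>> perDeviceCopy;
  std::size_t lastWriter = 0; // index of the device holding up-to-date data

  ToyMem(std::size_t numDevices, std::size_t count)
      : perDeviceCopy(numDevices, std::vector<int>(count, 0)) {}

  // Copy the up-to-date data to `device` if its copy may be stale.
  // Returns true when a migration actually happened.
  bool migrateTo(std::size_t device) {
    assert(device < perDeviceCopy.size());
    if (device == lastWriter)
      return false;
    perDeviceCopy[device] = perDeviceCopy[lastWriter];
    // Assumption: the command may write, so this device becomes the owner.
    lastWriter = device;
    return true;
  }
};

struct ToyQueue {
  std::size_t device; // device this queue submits work to
};

// Hypothetical stand-in for the plugin-side entry point: migrate every mem
// the command uses to the queue's device, then run the user's lambda.
// Returns the number of migrations performed.
std::size_t enqueueNativeCommand(const ToyQueue &queue,
                                 std::vector<ToyMem *> mems,
                                 const std::function<void()> &nativeCommand) {
  std::size_t migrations = 0;
  for (ToyMem *mem : mems)
    if (mem->migrateTo(queue.device))
      ++migrations;
  nativeCommand(); // the HT lambda now sees valid data on queue.device
  return migrations;
}
```

In this model, a lambda submitted on device 0 that writes to `perDeviceCopy[0]`, followed by a lambda submitted on device 1, triggers one copy before the second lambda runs. Because the runtime passes the lambda and its mems to a single entry point, it never has to reason about which device currently holds valid data.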

hdelan, Jun 28 '24 16:06