[SYCL][UR] Implement sycl_ext_oneapi_device_wait

Open steffenlarsen opened this issue 3 months ago • 1 comments

This commit implements the UR functionality for device-wide synchronization and the SYCL APIs using it. The latter implements the sycl_ext_oneapi_device_wait extension.

steffenlarsen avatar Oct 16 '25 12:10 steffenlarsen

This is probably more of a question for the spec, but it would also affect tests, so I'll ask here. Is the behavior well defined:

  • In the presence of L0 interop?
  • With tasks running on sub-/parent devices when waiting on the parent/sub-device?

aelovikov-intel avatar Nov 25 '25 21:11 aelovikov-intel

> This is probably more of a question for the spec, but it would also affect tests, so I'll ask here. Is the behavior well defined:
>
>   • In the presence of L0 interop?
>   • With tasks running on sub-/parent devices when waiting on the parent/sub-device?

Tag @gmlueck.

I am not certain that L0 interop needs a note, but I agree that it would be useful to document explicit behavior for parent-/sub-device synchronization.

steffenlarsen avatar Dec 15 '25 07:12 steffenlarsen

> I am not certain that L0 interop needs a note, but I agree that it would be useful to document explicit behavior for parent-/sub-device synchronization.

This is a good point. The users asking for this feature want parity with CUDA, but CUDA has no concept of sub-devices or parent devices. I think it would make sense for this API to wait only for commands submitted to the specific device (not for commands submitted to its parent or sub-devices). However, we also need to consider what can be implemented in the backend.

Do we know how the Level Zero API behaves w.r.t. sub-devices / parent devices?

gmlueck avatar Dec 15 '25 15:12 gmlueck

@intel/llvm-reviewers-runtime & @intel/dpcpp-tools-reviewers - Linux drivers now support device-wide synchronization, so this patch is ready for review! 🥳

steffenlarsen avatar Dec 17 '25 11:12 steffenlarsen