LevelZero UR Adapter Release Fails
Describe the bug
We have started to observe flaky behavior in a workflow on the CUTLASS SYCL repository.
The workflow in question uses nightly builds of DPC++. The first failure was observed on 2025-05-14, with the nightly release from the same date.
The workflow tests the CUTLASS Python interface. In the SYCL implementation, the DPCTL framework is used to interface with the SYCL runtime from Python code.
To aid investigation, we added SYCL_UR_TRACE=2 to the workflow. With tracing enabled, we could trace the latest failure back to a failed release of the LevelZero UR adapter:
terminate called after throwing an instance of 'sycl::_V1::exception'
---> urAdapterRelease
what(): Native API failed. Native API returns: 37 (UR_RESULT_ERROR_UNINITIALIZED)
<--- urAdapterRelease(.hAdapter = 0x560130490760) -> UR_RESULT_ERROR_UNINITIALIZED;
/home/runner/actions-runner/_work/_temp/e6ff9a6b-8d76-49d8-8aee-ab852c8279b9.sh: line 5: 68971 Aborted
Full log is available here.
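For anyone triaging a similar trace, here is a minimal sketch of a filter that pulls the non-successful UR calls out of SYCL_UR_TRACE=2 output. The regular expression is an assumption based only on the `<--- name(args) -> RESULT;` line shape visible in the excerpt above, and `failed_ur_calls` is a hypothetical helper name, not part of DPC++ or DPCTL.

```python
import re

# Exit lines look like "<--- name(args) -> UR_RESULT_...;" in the excerpt above.
# This pattern is an assumption derived from that excerpt only.
EXIT_LINE = re.compile(
    r"<--- (?P<func>\w+)\((?P<args>.*)\) -> (?P<result>UR_RESULT_\w+);?"
)


def failed_ur_calls(trace_text):
    """Return (function, args, result) for every traced call that did not succeed."""
    failures = []
    for line in trace_text.splitlines():
        match = EXIT_LINE.search(line)
        if match and match.group("result") != "UR_RESULT_SUCCESS":
            failures.append(
                (match.group("func"), match.group("args"), match.group("result"))
            )
    return failures


# Two lines copied from the log excerpt in this report.
sample = """\
---> urAdapterRelease
<--- urAdapterRelease(.hAdapter = 0x560130490760) -> UR_RESULT_ERROR_UNINITIALIZED;
"""

for func, args, result in failed_ur_calls(sample):
    print(f"{func}({args}) -> {result}")
```

Running the same filter over the full log should show whether urAdapterRelease is the only call that fails or whether an earlier call already returned an error.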
To reproduce
See the full log above or contact issue reporter.
Environment
- OS: Linux
- Target device and vendor: Intel Data Center GPU Max 1100
- DPC++ version: Nightly release 2025-05-14 (first version with which the behavior was observed).
Additional context
No response
@kbenzie mentioned that the UMF tag bump in https://github.com/intel/llvm/pull/18378 could help.
Which CPU are you using? @sommerlukas
@kbenzie mentioned that the UMF tag bump in #18378 could help.
This has just merged so should be part of the next nightly build.
#18378 has not resolved this issue.
I think this has the hallmarks of an attempt to release an already released adapter. That would explain UR_RESULT_ERROR_UNINITIALIZED coming from the loader: the adapter function pointer being called is nullptr.
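To illustrate the suspected failure mode, here is a toy model in Python. It is not the Unified Runtime loader code; `ToyLoader`, its fields, and the teardown step that clears the dispatch entry are all hypothetical and only mirror the hypothesis stated above (None stands in for a nullptr function pointer).

```python
UR_RESULT_SUCCESS = "UR_RESULT_SUCCESS"
UR_RESULT_ERROR_UNINITIALIZED = "UR_RESULT_ERROR_UNINITIALIZED"


class ToyLoader:
    """Hypothetical stand-in for a loader that forwards calls to an adapter."""

    def __init__(self):
        # Dispatch table entry; None plays the role of a nullptr function pointer.
        self.adapter_release_fn = self._adapter_release
        self.ref_count = 1

    def _adapter_release(self):
        # Adapter-side release: drop one reference.
        self.ref_count -= 1
        if self.ref_count == 0:
            # Assumed teardown step: the adapter goes away and its entry is cleared.
            self.adapter_release_fn = None
        return UR_RESULT_SUCCESS

    def adapter_release(self):
        # Loader-side entry point: refuse to call through a cleared pointer.
        if self.adapter_release_fn is None:
            return UR_RESULT_ERROR_UNINITIALIZED
        return self.adapter_release_fn()


loader = ToyLoader()
first = loader.adapter_release()   # final reference dropped, entry cleared
second = loader.adapter_release()  # extra release hits the cleared entry
print(first, second)
```

In this model the first release succeeds and the second returns UR_RESULT_ERROR_UNINITIALIZED, matching the result code in the trace. If the real cause is the same, the trace should show more release calls than get/retain calls for the same hAdapter handle.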
This is potentially related to URT-931 and/or URT-939.