[UR][L0v2] Add initial record & replay implementation

Open kswiecicki opened this issue 4 weeks ago • 3 comments

kswiecicki, Dec 09 '25 14:12

@pbalcer, the experimental header zex_graph.h is unavailable when installing the compute-runtime release package from https://github.com/intel/compute-runtime/releases. To avoid a compilation failure caused by the unknown ze_graph_handle_t type, I've temporarily hardcoded fetching compute-runtime, which includes the latest experimental headers. As a final solution, should I copy this header into the L0 adapters subdirectory?

kswiecicki, Dec 09 '25 14:12

@lslusarczyk I've removed UR_DFAILURE from ZE2UR_CALL_THROWS because it makes handling exceptions impossible in debug mode.
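To illustrate the point, here is a minimal sketch of a throwing call wrapper with no debug-failure hook in the failure path. Every name in it (Status, CallError, callThrows, runWithFallback) is hypothetical and is not the real UR or Level Zero code. It also assumes that the debug hook would stop the process in debug builds, which is what would keep a caller's catch block from ever running.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins; these are NOT the real UR or Level Zero definitions.
enum class Status { Success, Failure };

struct CallError : std::runtime_error {
  using std::runtime_error::runtime_error;
};

// Throwing wrapper with no debug-failure hook: a failed call is reported
// only as an exception, so the caller decides how to handle it.
inline void callThrows(const std::function<Status()> &call, const char *name) {
  if (call() != Status::Success)
    throw CallError(std::string(name) + " failed");
}

// A caller that recovers from the failure. If the wrapper stopped the
// process before throwing (the assumption made above about a debug hook),
// this catch block would be unreachable in debug builds.
inline std::string runWithFallback(const std::function<Status()> &call) {
  try {
    callThrows(call, "hypotheticalCall");
    return "ok";
  } catch (const CallError &e) {
    return std::string("recovered: ") + e.what();
  }
}
```

With this shape, a failing call passed to runWithFallback ends up in the catch block in both debug and release builds.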

kswiecicki, Dec 09 '25 14:12

> As a final solution, should I copy this header and paste it into the L0 adapters subdirectory?

Yes, I think we should do that for all experimental level-zero APIs we use.

pbalcer, Dec 09 '25 14:12

The urEnqueueGraph tests are failing with "Segmentation fault from GPU at 0x80d5ffa50000, ctx_id: 1 (CCS) type: 1 (WriteAccessViolation), level: 0 (PTE), access: 1 (Write), banned: 1, aborting". They pass for me locally on Intel(R) Arc(TM) B580 Graphics 20.1.0.

kswiecicki, Dec 11 '25 11:12

> The urEnqueueGraph tests are failing with "Segmentation fault from GPU at 0x80d5ffa50000, ctx_id: 1 (CCS) type: 1 (WriteAccessViolation), level: 0 (PTE), access: 1 (Write), banned: 1, aborting". They pass for me locally on Intel(R) Arc(TM) B580 Graphics 20.1.0.

This probably indicates that we need a newer driver in CI. For the time being, I suggest disabling the tests and filing an issue to re-enable them after the driver update.

pbalcer, Dec 11 '25 11:12

> Intel(R) Arc(TM) B580 Graphics 20.1.0

Done, I've turned off those tests and created an issue for this: https://github.com/intel/llvm/issues/20884.

kswiecicki, Dec 12 '25 08:12

@intel/llvm-gatekeepers please consider merging

github-actions[bot], Dec 12 '25 16:12

The SYCL Pre Commit on Linux test timeout was not caused by this PR.

Timed Out Tests (1):
  SYCL :: HostInteropTask/host-task-failure.cpp

kswiecicki, Dec 16 '25 13:12