Info queries for interop objects fail silently

Open alexbatashev opened this issue 1 year ago • 3 comments

Objects created through interop APIs often lack information that info queries require. For example, urEventCreateWithNativeHandle expects a native CUDA handle, and the resulting event handle lacks a reference to a UR queue. However, urEventGetInfo neither checks whether an object was created from a native handle nor returns an error in that case. Instead, the returned queue handle is invalid, i.e. == nullptr:

https://github.com/oneapi-src/unified-runtime/blob/2b7b827cfb8c87d84039188ed85d4e7d9fb738de/source/adapters/cuda/event.cpp#L160-L183

alexbatashev • Jan 09 '24 09:01

I expect that this also applies to HIP, but not OpenCL. My initial thought on Level Zero is that this issue should not apply, but that needs to be confirmed.

Possible solutions:

  • Could track events created with handles and return an error on queries that cannot be supported.
  • Could always fully initialize the event, but is this possible?
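The first option could look roughly like the sketch below. It is a minimal, self-contained model and not the adapter's actual code: `Queue`, `Event`, `Result`, `createWithNativeHandle`, `createFromQueue`, and `getEventQueue` are hypothetical stand-ins for the UR handles and entry points, and the error name is only an assumption about what such a result code might be called.

```cpp
// Hypothetical stand-ins for UR handles and result codes (not the real API).
struct Queue {
  int id;
};

enum class Result { Success, UnsupportedQuery };

struct Event {
  Queue *queue = nullptr;        // stays null for interop events
  bool fromNativeHandle = false; // set only on the interop creation path
};

// Models creation from a native handle: no UR queue is known at this point.
Event createWithNativeHandle(void * /*nativeHandle*/) {
  Event e;
  e.fromNativeHandle = true;
  return e;
}

// Models regular creation, where the owning queue is known.
Event createFromQueue(Queue *q) {
  Event e;
  e.queue = q;
  return e;
}

// Queue query that reports an error instead of silently handing back nullptr.
Result getEventQueue(const Event &e, Queue **outQueue) {
  if (e.fromNativeHandle && e.queue == nullptr)
    return Result::UnsupportedQuery; // the interop event cannot answer this
  if (outQueue)
    *outQueue = e.queue;
  return Result::Success;
}
```

With this shape, a caller that queries the queue of an interop event gets an explicit error code to branch on, while events created through a queue behave as before.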

alycm • Jan 10 '24 15:01

> Expect that this also applies to HIP, but not OpenCL, initial thought on Level Zero is that this issue should not apply but need to be confirmed.

Yes, HIP is also affected by this issue. L0 is not a target of interest for me right now, so I didn't check.

> Could track events created with handles and return an error on queries that cannot be supported.

Actually, the kind of information I needed to extract from an event is the associated platform backend. This kind of info query seems to be trivial to implement for any of the handles, but that should be discussed separately.

Returning an error (while not a perfect solution) is still much better than silently returning invalid handles.

> Could always fully initialize the event, but is this possible?

I remember that early implementations of the DPC++ backend interop APIs accepted a custom structure with a few more data fields, so that the object could be constructed with sufficient information. ~~I'll try to find a PR for a reference.~~ (UPD: here) This would solve some other issues, like how to capture profiling info in the UR events. However, I suspect that it's not always possible to provide a reference to a queue when integrating with an external system.

alexbatashev • Jan 12 '24 13:01

FYI, there is a related issue, https://github.com/intel/llvm/issues/13706: for interop events, the UR_EVENT_INFO_COMMAND_EXECUTION_STATUS query does not behave correctly. It treats the event as never recorded, based on the default-initialized IsRecorded value, regardless of the actual status of the event (if it was recorded externally).
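To make the status problem concrete, here is a small self-contained model of it. All names (`NativeEvent`, `Event`, `Status`, `statusNaive`, `statusInteropAware`) are hypothetical and only loosely mirror the adapter. The `done` flag stands in for whatever a native completion query would report, and the "interop-aware" variant assumes that an event created from a native handle should be treated as already recorded, which may not hold for every external system.

```cpp
enum class Status { Submitted, Running, Complete };

// Stand-in for a native event; `done` models the result of a native query.
struct NativeEvent {
  bool done;
};

struct Event {
  NativeEvent *native = nullptr;
  bool isRecorded = false; // default-initialized; never set for interop events
  bool isInterop = false;  // hypothetical flag set on native-handle creation
};

// Models the reported behavior: an unrecorded event is always "submitted",
// so an externally recorded interop event never appears to make progress.
Status statusNaive(const Event &e) {
  if (!e.isRecorded)
    return Status::Submitted;
  return e.native->done ? Status::Complete : Status::Running;
}

// Possible alternative (an assumption): skip the IsRecorded check for
// interop events and rely on the native event's own state.
Status statusInteropAware(const Event &e) {
  if (!e.isInterop && !e.isRecorded)
    return Status::Submitted;
  return e.native->done ? Status::Complete : Status::Running;
}
```

In this model, an interop event whose native event has already completed is still reported as submitted by `statusNaive`, whereas `statusInteropAware` reports it as complete.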

al42and • May 31 '24 18:05