Add a way to translate from/to ZES/ZE device handle when using zesInit()
Followup to https://github.com/intel/compute-runtime/issues/686#issuecomment-1805025932
We need a way to translate from/to ZE/ZES device handles when using zesInit(). This is needed to query some device info from ZES and combine it with ZE info. It was reported as work in progress, but I still cannot find it and could not get a reply about the specification/implementation status.
Hi @bgoglin, we have added APIs to map devices based on their UUID in spec 1.9: https://spec.oneapi.io/level-zero/latest/sysman/api.html#sysmandevicemapping. The implementation is in progress and is expected to be available by the end of June 2024.
Hello. Can you update me regarding the implementation status? If done, which releases started having it?
Hi @bgoglin, using the UUID we can map from a core device handle to a sysman device handle, but this is not a direct cast from core device to sysman device. Casting between core and sysman handles is not possible, and we are not planning to support it in either direction. Please read Sysman Initialization: https://github.com/intel/compute-runtime/blob/master/programmers-guide/SYSMAN.md
@bgoglin please also go through the APIs at https://oneapi-src.github.io/level-zero-spec/level-zero/latest/sysman/api.html#sysmandevicemapping-functions for mapping from a core device handle to a sysman device handle.
Please find below the steps for mapping a core device handle to a sysman device handle and vice versa when using zesInit().
Mapping core handle to sysman handle:
1. Get the UUID of the core handle from "zeDeviceGetProperties(ze_device_handle_t hDevice, ze_device_properties_t *pDeviceProperties)" (note down ze_device_properties_t.uuid).
2. Pass that UUID along with the sysman driver handle to "zesDriverGetDeviceByUuidExp(zes_driver_handle_t hDriver, zes_uuid_t uuid, zes_device_handle_t *phDevice, ze_bool_t *onSubdevice, uint32_t *subdeviceId)". The sysman handle equivalent to the core handle is returned in "phDevice". Note: hDriver must be obtained from zesDriverGet().
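The two steps above might look roughly like this in C. This is only a sketch, not reference code from the runtime: `HAVE_LEVEL_ZERO` is a hypothetical build macro used here so the Level Zero part can be compiled out on machines without the headers, and `copy_uuid()` and `core_to_sysman()` are illustrative names. It assumes both UUID types hold 16 bytes (checked at compile time when the headers are present) and that the caller has already initialized core and sysman and obtained the sysman driver handle from zesDriverGet().

```c
#include <stdint.h>
#include <string.h>

/* Assumed size of both the core and the sysman UUID byte arrays. */
#define UUID_BYTES 16

/* ze_device_uuid_t and zes_uuid_t are distinct struct types, so the
 * bytes are copied rather than passing one struct where the other is
 * expected. */
void copy_uuid(uint8_t dst[UUID_BYTES], const uint8_t src[UUID_BYTES])
{
    memcpy(dst, src, UUID_BYTES);
}

#ifdef HAVE_LEVEL_ZERO /* hypothetical macro: define it when the headers exist */
#include <level_zero/ze_api.h>
#include <level_zero/zes_api.h>

/* Illustrative helper: returns the sysman handle matching core_dev in
 * *out_dev. on_subdevice and subdevice_id are filled in by the driver. */
ze_result_t core_to_sysman(ze_device_handle_t core_dev,
                           zes_driver_handle_t sysman_drv,
                           zes_device_handle_t *out_dev,
                           ze_bool_t *on_subdevice,
                           uint32_t *subdevice_id)
{
    ze_device_properties_t props;
    zes_uuid_t uuid;
    ze_result_t res;

    _Static_assert(sizeof(props.uuid.id) == UUID_BYTES &&
                   sizeof(uuid.id) == UUID_BYTES,
                   "UUID size assumption does not hold");

    /* Step 1: read the UUID of the core device. */
    memset(&props, 0, sizeof(props));
    props.stype = ZE_STRUCTURE_TYPE_DEVICE_PROPERTIES;
    res = zeDeviceGetProperties(core_dev, &props);
    if (res != ZE_RESULT_SUCCESS)
        return res;

    /* Step 2: ask the sysman driver for the device with that UUID. */
    copy_uuid(uuid.id, props.uuid.id);
    return zesDriverGetDeviceByUuidExp(sysman_drv, uuid, out_dev,
                                       on_subdevice, subdevice_id);
}
#endif /* HAVE_LEVEL_ZERO */
```

Callers should check the return value: as reported later in this thread, some installations return ZE_RESULT_ERROR_UNSUPPORTED_FEATURE from zesDriverGetDeviceByUuidExp().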
Mapping sysman handle to core handle:
1. Get the UUID of the sysman device handle from "zesDeviceGetProperties(zes_device_handle_t hDevice, zes_device_properties_t *pProperties)". If zes_device_properties_t.numSubdevices > 0, also get the UUIDs of the sub-devices using "zesDeviceGetSubDevicePropertiesExp(zes_device_handle_t hDevice, uint32_t *pCount, zes_subdevice_exp_properties_t *pSubdeviceProps)".
2. Compare the UUID of the sysman device handle of interest with the UUIDs of the core device handles obtained from "zeDeviceGetProperties(ze_device_handle_t hDevice, ze_device_properties_t *pDeviceProperties)". The core device handle whose UUID matches is the mapped core device handle.
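The comparison in the last step could be sketched as follows. The names `uuid_equal()` and `core_device_for_uuid()` are illustrative, and `HAVE_LEVEL_ZERO` is a hypothetical build macro that only guards the part needing the Level Zero headers. The sketch assumes 16-byte UUIDs and that the caller has already collected the candidate core handles (and, if sub-device UUIDs are being matched, presumably their sub-device handles as well) and extracted the sysman UUID bytes as described above.

```c
#include <stdint.h>
#include <string.h>

/* Assumed size of both the core and the sysman UUID byte arrays. */
#define UUID_BYTES 16

/* Returns nonzero when the two UUIDs are byte-for-byte identical. */
int uuid_equal(const uint8_t a[UUID_BYTES], const uint8_t b[UUID_BYTES])
{
    return memcmp(a, b, UUID_BYTES) == 0;
}

#ifdef HAVE_LEVEL_ZERO /* hypothetical macro: define it when the headers exist */
#include <level_zero/ze_api.h>

/* Illustrative helper: scans the caller-provided core handles and returns
 * the first one whose UUID matches sysman_uuid, or NULL if none does. */
ze_device_handle_t core_device_for_uuid(const uint8_t sysman_uuid[UUID_BYTES],
                                        ze_device_handle_t *core_devs,
                                        uint32_t n_core)
{
    for (uint32_t i = 0; i < n_core; i++) {
        ze_device_properties_t props;

        memset(&props, 0, sizeof(props));
        props.stype = ZE_STRUCTURE_TYPE_DEVICE_PROPERTIES;

        /* Skip devices whose properties cannot be read. */
        if (zeDeviceGetProperties(core_devs[i], &props) != ZE_RESULT_SUCCESS)
            continue;

        if (uuid_equal(sysman_uuid, props.uuid.id))
            return core_devs[i];
    }
    return NULL;
}
#endif /* HAVE_LEVEL_ZERO */
```

A linear scan is used because the number of devices per node is small; nothing here is specific to a particular driver version.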
I am trying to test this but I don't have any machine with subdevices. There are some PVC on endeavour but I am getting ZE_RESULT_ERROR_UNSUPPORTED_FEATURE from zesDriverGetDeviceByUuidExp(). Assuming this is returned by libze_loader.so, then I guess the level-zero package is too old (1.17.6-i950)? I have 1.17.44 on my laptop, things work fine but I don't have subdevices there. If you tell me what to upgrade and to which version, I'll ask the endeavour admins.
I am closing this since things work in hwloc now. However, I'd still like an answer to the question above about which version first supported zesDriverGetDeviceByUuidExp(). Overall, I really dislike the idea of adding a no-op function to respect the spec and later actually implement it. This has been a nightmare for us several times in the past.
Hi @bgoglin, the zesDriverGetDeviceByUuidExp() API is supported starting from libze_intel_gpu.so.1.x.xxxxx where xxxxx is greater than 30220.
Any idea where to find RHEL (9.4) packages with this? The doc [1] currently points to some "lts/2350" RPM repository which contains a much older libze_intel_gpu.so.1.3.27642.66
[1] https://dgpu-docs.intel.com/driver/installation.html