David Yat Sin
Thanks. We have started working on it and will post an ETA as soon as possible.
This information comes from the libdrm-amdgpu package. @bigtrak @al42and @milthorpe, can you please check that you have a recent version of libdrm-amdgpu installed? libdrm usually stores this information here: /opt/amdgpu/share/libdrm/amdgpu.ids
It looks like it comes from the libdrm-amdgpu-common package (apt install libdrm-amdgpu-common), which installs the file here: /opt/amdgpu/share/libdrm/amdgpu.ids
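A quick way to confirm that the file is actually in place is a small check like the one below. This is only a sketch: the helper names `is_readable_file` and `report_amdgpu_ids` are made up for illustration, and the path is the one quoted above, which may differ on other installs.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Path quoted in the comment above; other installs may place the file elsewhere. */
#define AMDGPU_IDS_PATH "/opt/amdgpu/share/libdrm/amdgpu.ids"

/* Hypothetical helper: returns 1 if path names a regular file the caller
 * can read, 0 otherwise. */
static int is_readable_file(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return 0;                   /* missing or inaccessible */
    if (!S_ISREG(st.st_mode))
        return 0;                   /* e.g. a directory */
    return access(path, R_OK) == 0;
}

/* Hypothetical helper: prints a one-line status and returns the same 1/0 result. */
static int report_amdgpu_ids(const char *path)
{
    int ok = is_readable_file(path);

    printf("%s: %s\n", path, ok ? "present and readable" : "missing or unreadable");
    return ok;
}
```

Calling `report_amdgpu_ids(AMDGPU_IDS_PATH)` from a `main()` prints whether the file is there. It does not check how recent the package is.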
> > > > I'm assuming that you are referring to the [`unpause_process()`](https://github.com/checkpoint-restore/criu/blob/5de9040ee758f1fd1a2599b6f800013544c966b6/plugins/amdgpu/amdgpu_plugin.c#L1498) call at the end of `amdgpu_plugin_dump_file()`. This functionality was introduced with commit [55a5993](https://github.com/checkpoint-restore/criu/commit/55a5993bc73a6d2e9551f275c78e0907c5dff686) and perhaps @dayatsin-amd might...
Overall, the general idea/concept of this refactor is fine.
What do you mean by terminate a kernel dispatch? Are you trying to cancel a kernel dispatch that was already enqueued?
When the process ends, the dispatched kernels are terminated because the process is being destroyed inside the Linux kernel, which causes the queues to be unmapped. But...
Tracked internally with ticket: SWDEV-430447
Thanks for the patch. It seems I can use pthread_setaffinity_np, so I removed the old calls to pthread_attr_setaffinity_np. The patch will be part of ROCm 6.1.
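For context, `pthread_attr_setaffinity_np` sets the CPU mask on a `pthread_attr_t` before the thread is created, while `pthread_setaffinity_np` applies it to a thread that already exists. The sketch below shows the second form. It is not the code from the patch, and the helper names `first_allowed_cpu` and `pin_thread_to_cpu` are hypothetical. Both calls are GNU extensions, so `_GNU_SOURCE` has to be defined before any include.

```c
#define _GNU_SOURCE   /* needed for pthread_setaffinity_np and the CPU_* macros */
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: returns the lowest CPU index the calling thread is
 * allowed to run on, or -1 if the affinity mask cannot be read. */
static int first_allowed_cpu(void)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Hypothetical helper: pins an already-running thread to a single CPU.
 * Returns 0 on success or an errno-style value on failure. */
static int pin_thread_to_cpu(pthread_t thread, int cpu)
{
    cpu_set_t set;

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return EINVAL;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(thread, sizeof(set), &set);
}
```

A thread can pin itself with `pin_thread_to_cpu(pthread_self(), first_allowed_cpu())`. Picking the CPU from the current mask avoids asking for a core that a cpuset or container has made unavailable.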
The patch will be included in the ROCm 6.1 release.