[Issue]: Asynchronous execution with hipExtModuleLaunchKernel
Problem Description
It is my understanding that by passing hipExtAnyOrderLaunch as the last argument to hipExtModuleLaunchKernel, I can get the kernels I dispatch to execute asynchronously with respect to one another. That is, I can have a single hipStream_t, dispatch every kernel to it by calling hipExtModuleLaunchKernel with that flag set, and the kernels will be free to execute concurrently. Is that correct?
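For concreteness, a minimal sketch of the launch pattern in question follows. The code object name kernels.hsaco, the kernel name my_kernel, and its single float* parameter are hypothetical placeholders, and the launch sizes are arbitrary. The sketch only shows how the flag is passed; it does not by itself demonstrate that the kernels overlap.

```cpp
#include <hip/hip_runtime.h>
#include <hip/hip_ext.h>  // declares hipExtModuleLaunchKernel
#include <cstdint>
#include <cstdio>
#include <cstdlib>

// Abort with a readable message if a HIP call fails.
#define HIP_CHECK(expr)                                                   \
    do {                                                                  \
        hipError_t err_ = (expr);                                         \
        if (err_ != hipSuccess) {                                         \
            std::fprintf(stderr, "%s failed: %s\n", #expr,                \
                         hipGetErrorString(err_));                        \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

int main() {
    // Hypothetical code object and kernel; the kernel is assumed to be
    // declared as: extern "C" __global__ void my_kernel(float* data)
    hipModule_t module;
    hipFunction_t kernel;
    HIP_CHECK(hipModuleLoad(&module, "kernels.hsaco"));
    HIP_CHECK(hipModuleGetFunction(&kernel, module, "my_kernel"));

    // A single non-blocking stream that receives every launch.
    hipStream_t stream;
    HIP_CHECK(hipStreamCreateWithFlags(&stream, hipStreamNonBlocking));

    constexpr int kNumLaunches = 4;       // arbitrary
    constexpr std::uint32_t kSize = 256;  // arbitrary work size
    float* buffers[kNumLaunches];
    for (int i = 0; i < kNumLaunches; ++i) {
        HIP_CHECK(hipMalloc(reinterpret_cast<void**>(&buffers[i]),
                            kSize * sizeof(float)));
    }

    for (int i = 0; i < kNumLaunches; ++i) {
        // Each launch works on its own buffer, so the launches are independent.
        void* args[] = {&buffers[i]};
        HIP_CHECK(hipExtModuleLaunchKernel(
            kernel,
            kSize, 1, 1,             // global work size in X, Y, Z
            kSize, 1, 1,             // local work size in X, Y, Z
            0,                       // dynamic shared memory in bytes
            stream,
            args,                    // kernel parameters
            nullptr,                 // extra
            nullptr, nullptr,        // start and stop events (unused)
            hipExtAnyOrderLaunch));  // the flag in question
    }

    HIP_CHECK(hipStreamSynchronize(stream));

    for (int i = 0; i < kNumLaunches; ++i) {
        HIP_CHECK(hipFree(buffers[i]));
    }
    HIP_CHECK(hipStreamDestroy(stream));
    HIP_CHECK(hipModuleUnload(module));
    return 0;
}
```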
I've been experimenting with this but couldn't achieve that behaviour. I used a single non-blocking stream, but all the kernels I launched through the above entry point executed one after another, despite the flag being set to 1. I checked this by profiling with rocprof and loading the trace into https://ui.perfetto.dev/ to see whether the kernels overlap in time.
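The same check can be made without a GUI by comparing kernel begin and end timestamps. The sketch below assumes those timestamps have already been extracted from the profiler output into (begin, end) pairs in nanoseconds; the KernelSpan struct and count_overlapping function are hypothetical names used only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record: one kernel dispatch with its begin and end time in ns.
struct KernelSpan {
    std::uint64_t begin_ns;
    std::uint64_t end_ns;
};

// Counts kernels that begin before an earlier-starting kernel has finished.
// A result of 0 means every kernel ran strictly after the previous ones ended.
std::size_t count_overlapping(std::vector<KernelSpan> spans) {
    // Order the dispatches by start time.
    std::sort(spans.begin(), spans.end(),
              [](const KernelSpan& a, const KernelSpan& b) {
                  return a.begin_ns < b.begin_ns;
              });

    std::size_t overlaps = 0;
    std::uint64_t latest_end = 0;  // latest end time seen so far
    for (const KernelSpan& s : spans) {
        if (s.begin_ns < latest_end) {
            ++overlaps;  // started while an earlier kernel was still running
        }
        latest_end = std::max(latest_end, s.end_ns);
    }
    return overlaps;
}
```

With the serialized behaviour described above, this count would be 0 for the whole trace; any non-zero count would indicate that at least two kernels were in flight at the same time.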
Would you be able to provide an example of how to use this feature to achieve concurrency within a single stream, and of how to profile it to confirm the expected behaviour? Thank you!
Operating System
Ubuntu
CPU
AMD EPYC 7763 64-Core Processor
GPU
AMD Instinct MI210
ROCm Version
ROCm 6.0.0
ROCm Component
No response
Steps to Reproduce
No response
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
No response