HIP
Multi-device use from multiple CPU threads
Could ROCm consider a lock-free hipMemcpyAsync API for the case where each CPU thread controls its own device?
#pragma omp for
for (deviceId = 0; deviceId < deviceSize; ++deviceId)
{
    // Each thread selects its own device before issuing the copy.
    hipSetDevice(deviceId);
    hipMemcpyAsync(...);
}
I know there is a spinlock inside the hipMemcpyAsync API. If each thread first calls hipSetDevice for its own device, is that spinlock still taken, or is it bypassed?
@stabunkow Apologies for the lack of response. Do you still need assistance with this ticket? Thanks!