
Add asynchronous copy

Open PearCoding opened this issue 2 years ago • 4 comments

Add asynchronous copy operation anydsl_copy_async.

The "async" is only a hint and only takes effect on CUDA and OpenCL. I did not find a suitable method for HSA. The CPU could support async copies, but the host is usually handled as a single unit without async capabilities, so it was intentionally left out.

Tested with Rodent (Artic).

PearCoding avatar Sep 06 '23 15:09 PearCoding

If the copy is asynchronous, how do you know it's finished? A device-wide barrier?

Hugobros3 avatar Sep 09 '23 17:09 Hugobros3

Yes. Unfortunately, there is no access to streams or other finer-grained barriers in the API. Finding a common set of features across all the device types we support is quite difficult, especially because of OpenCL. :/

If you have an idea for finer-grained barriers, feel free to mention it. I am very interested in that :D

PearCoding avatar Sep 11 '23 09:09 PearCoding
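
To illustrate the pattern described in this comment (issue copies asynchronously, then wait on a device-wide barrier), here is a minimal, self-contained sketch. It does not use the AnyDSL runtime: MockDevice, copy_async, synchronize and pending are hypothetical names, and the device is simulated with std::async on the host. It only shows that the destination buffer is safe to read after the barrier, not before.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <future>
#include <vector>

// Hypothetical stand-in for one device; NOT part of the AnyDSL runtime.
class MockDevice {
public:
    // Queue a copy and return immediately. The caller must keep src and dst
    // alive and untouched until synchronize() has returned.
    void copy_async(const void* src, void* dst, std::size_t size) {
        assert(src != nullptr && dst != nullptr);
        pending_.push_back(std::async(std::launch::async, [=] {
            std::memcpy(dst, src, size);
        }));
    }

    // Device-wide barrier: block until every queued copy has completed.
    void synchronize() {
        for (auto& f : pending_) f.wait();
        pending_.clear();
    }

    // Number of copies issued since the last barrier.
    std::size_t pending() const { return pending_.size(); }

private:
    std::vector<std::future<void>> pending_;
};
```

The barrier waits for everything in flight on the device, which is the coarse granularity this comment refers to: a caller cannot wait for one specific copy without also waiting for all the others.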

For HSA, you can use hsa_amd_memory_async_copy on AMD GPUs.

richardmembarth avatar Sep 13 '23 12:09 richardmembarth

The HSA function requires signals (which might be useful for events [other PR]). What would be the best practice for providing them for each call without exposing them to the AnyDSL user? A platform- or device-specific list of current signals?

PearCoding avatar Sep 13 '23 15:09 PearCoding
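
One way to read the "device-specific list of current signals" idea is sketched below, as an assumption rather than a description of the runtime's actual design. The runtime would own one list of outstanding signals per device, hand a fresh signal to each async copy internally, and drain the list on synchronization, so the user never sees a signal. MockSignal and SignalRegistry are hypothetical names; MockSignal stands in for a real completion signal, and the wait is a simple host-side spin.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <thread>
#include <unordered_map>

// Hypothetical stand-in for a backend completion signal.
struct MockSignal {
    std::atomic<bool> done{false};
};

// Hypothetical runtime-internal registry: one list of outstanding signals
// per device, never exposed to the user.
class SignalRegistry {
public:
    // Called internally for each async copy. std::deque keeps references to
    // existing elements valid when new signals are appended.
    MockSignal& acquire(int32_t device) {
        auto& list = pending_[device];
        list.emplace_back();
        return list.back();
    }

    // Called on device synchronization: wait for every outstanding signal
    // of this device, then release them. Returns how many were waited on.
    std::size_t synchronize(int32_t device) {
        auto it = pending_.find(device);
        if (it == pending_.end()) return 0;
        const std::size_t count = it->second.size();
        for (auto& signal : it->second) {
            while (!signal.done.load(std::memory_order_acquire))
                std::this_thread::yield();
        }
        it->second.clear();
        return count;
    }

    // Number of signals currently outstanding on a device.
    std::size_t pending(int32_t device) const {
        auto it = pending_.find(device);
        return it == pending_.end() ? 0 : it->second.size();
    }

private:
    std::unordered_map<int32_t, std::deque<MockSignal>> pending_;
};
```

In this sketch the list only grows between two synchronizations, so a real implementation would probably want to recycle completed signals. The same per-device list could also back the events mentioned above, since each event would map to one stored signal.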