gdrcopy
Thinking about working with the CUDA async API
Hi, have you thought about making the gdrcopy API work with the CUDA async API? In many scenarios, cuMemcpyAsync and other async APIs are used heavily. The gdrcopy copy functions could be driven from a stream callback (cuStreamAddCallback), but I don't think that would perform very well, and async mode is a big requirement. What do you think?