gdrcopy
Improve the error report of gdrcopy_pplat when the CUDA kernel cannot be launched
gdrcopy_pplat may be compiled with one version of the CUDA Toolkit but run with a different one. Although the CUDA JIT compiler should take care of converting the PTX for the target version, it fails in some situations. When that happens, the kernel cannot be launched, and we currently get the following error, which is indistinguishable from other errors.
...
CPU does gdr_copy_to_mapping and GPU writes back via cuMemHostAlloc'd buffer.
Running 1000 iterations with data size 4 bytes.
Assertion "(cuStreamQuery(0)) == (CUDA_ERROR_NOT_READY)" failed at pplat.cu:257
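
One possible way to make this failure distinguishable is to check the launch status right after the kernel launch, before the stream query, and to print the CUDA versions involved. The sketch below is only an illustration of that idea, not the actual pplat.cu code: `noop_kernel` and `check_kernel_launch` are hypothetical names, and the real kernel and its call site are not shown in this report. It assumes the kernel is launched through the CUDA runtime API.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical stand-in for the pplat kernel; the real kernel is not shown here.
__global__ void noop_kernel(int *flag)
{
    if (flag)
        *flag = 1;
}

// Hypothetical helper: report a failed launch with the error name and the
// toolkit/runtime/driver versions, so a PTX/JIT mismatch is easy to spot.
static int check_kernel_launch(const char *kernel_name)
{
    cudaError_t err = cudaGetLastError();  // status of the most recent launch
    if (err == cudaSuccess)
        return 0;

    int runtime_version = 0;
    int driver_version = 0;
    cudaRuntimeGetVersion(&runtime_version);
    cudaDriverGetVersion(&driver_version);

    fprintf(stderr, "Error: cannot launch %s: %s (%s)\n",
            kernel_name, cudaGetErrorName(err), cudaGetErrorString(err));
    fprintf(stderr, "Compiled with CUDA %d, runtime %d, driver supports up to %d\n",
            CUDART_VERSION, runtime_version, driver_version);
    return -1;
}

int main(void)
{
    int *d_flag = NULL;
    cudaError_t err = cudaMalloc((void **)&d_flag, sizeof(int));
    if (err != cudaSuccess) {
        fprintf(stderr, "cudaMalloc failed: %s\n", cudaGetErrorString(err));
        return EXIT_FAILURE;
    }

    noop_kernel<<<1, 1>>>(d_flag);

    // Check the launch itself before relying on any stream query.
    if (check_kernel_launch("noop_kernel") != 0) {
        cudaFree(d_flag);
        return EXIT_FAILURE;
    }

    err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "Kernel execution failed: %s\n", cudaGetErrorString(err));
        cudaFree(d_flag);
        return EXIT_FAILURE;
    }

    cudaFree(d_flag);
    return EXIT_SUCCESS;
}
```

With a check like this, a launch failure would be reported with its own error name and version information instead of surfacing later as the generic `cuStreamQuery` assertion.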