cuda-api-wrappers
Rethink numerous copy functions vis-a-vis the copy parameters structure
With recent CUDA versions, we have the CUDA_MEMCPY3D_PEER struct, which is quite flexible. We also have a large number of copy functions (40 all told), almost all of which are covered by a copy function taking this struct. And we probably don't even cover all of the struct's possibilities, specifically with respect to inter-context copying.
We have (at least) two choices:
- Make sure the individual copy functions also cover the inter-context case, or
- Reduce, perhaps drastically, the number of copy functions, in favor of a copy builder
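To make the second option more concrete, here is a rough sketch of what a copy builder could look like. Everything in it is hypothetical and is not the library's current API: `context_handle`, `copy_endpoint`, `copy_params`, `copy_builder` and the method names `from`, `to`, `extent` and `build` are made up for illustration. The sketch only assembles and validates a plain parameters struct; a real implementation would presumably fill a CUDA_MEMCPY3D_PEER (including its `srcContext` and `dstContext` fields) and pass it to `cuMemcpy3DPeer`.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>

// Hypothetical stand-in for a CUDA context handle; not the library's type.
using context_handle = int;

// One side of a copy: which context it lives in, and where.
struct copy_endpoint {
    context_handle context;
    const void*    address;
};

// Hypothetical plain-data result of the builder. A real version would
// translate this into a CUDA_MEMCPY3D_PEER before issuing the copy.
struct copy_params {
    copy_endpoint source;
    copy_endpoint destination;
    std::size_t   width_in_bytes;
    std::size_t   height;
    std::size_t   depth;

    bool is_inter_context() const { return source.context != destination.context; }
    std::size_t total_bytes() const { return width_in_bytes * height * depth; }
};

// Fluent builder: one entry point instead of many copy function overloads.
class copy_builder {
public:
    copy_builder& from(context_handle ctx, const void* ptr) {
        source_ = copy_endpoint{ctx, ptr};
        return *this;
    }
    copy_builder& to(context_handle ctx, void* ptr) {
        destination_ = copy_endpoint{ctx, ptr};
        return *this;
    }
    // Height and depth default to 1, so a linear copy needs only a width.
    copy_builder& extent(std::size_t width_in_bytes, std::size_t height = 1, std::size_t depth = 1) {
        width_ = width_in_bytes;
        height_ = height;
        depth_ = depth;
        return *this;
    }
    // Validates that both endpoints and a non-empty extent were given.
    copy_params build() const {
        if (!source_ || !destination_) {
            throw std::logic_error("copy_builder: source and destination must both be set");
        }
        if (width_ == 0 || height_ == 0 || depth_ == 0) {
            throw std::logic_error("copy_builder: extent must be non-zero in every dimension");
        }
        return copy_params{*source_, *destination_, width_, height_, depth_};
    }

private:
    std::optional<copy_endpoint> source_;
    std::optional<copy_endpoint> destination_;
    std::size_t width_  = 0;
    std::size_t height_ = 1;
    std::size_t depth_  = 1;
};

// Usage: a linear 1 KiB copy between two different (hypothetical) contexts.
inline copy_params example_inter_context_copy(const void* src, void* dst) {
    copy_params params = copy_builder{}.from(0, src).to(1, dst).extent(1024).build();
    assert(params.is_inter_context());
    return params;
}
```

With something along these lines, the inter-context case falls out of the same interface as the intra-context one, since the contexts are just part of each endpoint. The cost is that mistakes such as a missing destination are caught at run time in `build()` rather than by the signature of a dedicated copy function.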
I'd appreciate some input from users of the library who have opined on design questions in the past, or have contributed code.