DCA Host to host copies trigger device synchronization.

Open gbalduzz opened this issue 6 years ago • 0 comments

All copies between Vector and Matrix objects, including host-to-host ones, are handled by cudaMemcpy with cudaMemcpyDefault. Since a host-to-host copy issued this way still triggers a device synchronization, this prevents us from executing CPU code in parallel with GPU memory transfers and kernel execution.

Relevant code: include/dca/linalg/util/copy.hpp

See: https://stackoverflow.com/questions/22430446/does-cuda-memcpy-from-host-to-host-perform-synchronization
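
One possible direction, shown only as a minimal sketch and not as the actual contents of copy.hpp: dispatch on where the two buffers live, and use a plain CPU copy when both are in host memory so that the CUDA runtime is never called for that case. The names `MemLocation`, `copyBytes` and the build flag `SKETCH_WITH_CUDA` are hypothetical and introduced here for illustration; DCA's own types and macros may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>

#ifdef SKETCH_WITH_CUDA  // Hypothetical build flag, not a DCA macro.
#include <cuda_runtime.h>
#endif

// Hypothetical tag for where a buffer lives; DCA's own types may differ.
enum class MemLocation { Host, Device };

// Copies `bytes` bytes from `src` to `dst` (the buffers must not overlap).
// Returns true when no CUDA runtime call was made, false otherwise.
inline bool copyBytes(void* dst, MemLocation dst_loc, const void* src, MemLocation src_loc,
                      std::size_t bytes) {
  if (bytes == 0)
    return true;  // Nothing to copy, so no runtime call is needed.
  assert(dst != nullptr && src != nullptr);

  // Host-to-host: a plain memcpy never touches the CUDA runtime.
  if (dst_loc == MemLocation::Host && src_loc == MemLocation::Host) {
    std::memcpy(dst, src, bytes);
    return true;
  }

#ifdef SKETCH_WITH_CUDA
  // At least one side is on the device: let the runtime infer the direction.
  const cudaError_t err = cudaMemcpy(dst, src, bytes, cudaMemcpyDefault);
  if (err != cudaSuccess)
    throw std::runtime_error(cudaGetErrorString(err));
  return false;
#else
  throw std::runtime_error("device copies require building with SKETCH_WITH_CUDA");
#endif
}
```

With a helper of this shape, a call such as `copyBytes(b, MemLocation::Host, a, MemLocation::Host, n * sizeof(double))` stays entirely on the CPU. Whether the device branch should instead use cudaMemcpyAsync on a stream is a separate design choice that this sketch does not address.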

Sep 17 '18 13:09 gbalduzz