Use cudaMemcpyBatchAsync
Description
This PR updates libcudf to use cudaMemcpyBatchAsync on supported systems (CUDA 12.8+). The batched API can have lower overhead than cudaMemcpyAsync and may reduce contention when multiple threads issue copies.
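As a rough illustration of the dispatch idea only, here is a host-only sketch and not the code in this PR. It picks a batched path when the runtime version is at least 12.8 and otherwise falls back to one call per copy. `CopyRequest`, `host_batch_copy`, and `copy_all` are hypothetical names, and `std::memcpy` stands in for the CUDA calls so the sketch runs without a GPU. The version number is assumed to use the CUDA encoding (1000 * major + 10 * minor), which is what `cudaRuntimeGetVersion` reports.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// CUDA encodes versions as 1000 * major + 10 * minor, so 12.8 is 12080.
constexpr int kMinBatchVersion = 12080;

// One logical copy: destination, source, and size in bytes.
struct CopyRequest {
  void* dst;
  const void* src;
  std::size_t bytes;
};

// Hypothetical stand-in for a single batched call such as
// cudaMemcpyBatchAsync: it takes parallel arrays and services
// every copy in one invocation. Here it just uses std::memcpy.
inline void host_batch_copy(const std::vector<void*>& dsts,
                            const std::vector<const void*>& srcs,
                            const std::vector<std::size_t>& sizes) {
  assert(dsts.size() == srcs.size() && srcs.size() == sizes.size());
  for (std::size_t i = 0; i < dsts.size(); ++i) {
    std::memcpy(dsts[i], srcs[i], sizes[i]);
  }
}

// Hypothetical utility: performs all copies and returns how many
// copy calls were issued (1 on the batched path, N on the fallback).
inline std::size_t copy_all(const std::vector<CopyRequest>& requests,
                            int runtime_version) {
  if (requests.empty()) { return 0; }

  if (runtime_version >= kMinBatchVersion) {
    // Batched path: gather the requests into parallel arrays, one call.
    std::vector<void*> dsts;
    std::vector<const void*> srcs;
    std::vector<std::size_t> sizes;
    for (const auto& r : requests) {
      dsts.push_back(r.dst);
      srcs.push_back(r.src);
      sizes.push_back(r.bytes);
    }
    host_batch_copy(dsts, srcs, sizes);
    return 1;
  }

  // Fallback path: one call per copy, as with cudaMemcpyAsync.
  for (const auto& r : requests) {
    std::memcpy(r.dst, r.src, r.bytes);
  }
  return requests.size();
}
```

The returned call count is only there to make the difference between the two paths visible. In real device code both paths are asynchronous with respect to the host, so callers still need to synchronize the stream before reading the destinations.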
TODO:
- [ ] Fix test failures: some tests need synchronization that was previously missing
- [ ] Document that we prefer to use cudf utilities for memcpy inside libcudf
Checklist
- [x] I am familiar with the Contributing Guidelines.
- [ ] New or existing tests cover these changes.
- [ ] The documentation is up to date with these changes.