cudf
cudf copied to clipboard
Increase multibyte_split performance by improving copy semantics.
reduces unnecessary copying by moving copy logic in to the data_chunk_reader
API, introducing a new method copy_next_chunk
, which on a per-implementation-basis can decide the best way to copy the underlying data source in to the required specific destination. This API is design to allow single-copy operations where using get_next_chunk
may otherwise perform a copy under the hood before the data reaches it's final destination. Consumers should prefer get_next_chunk
when the location of the data is not important, allowing device reads to take place in a zero-copy manner.
Thanks @ttnghia for the reminder. :)
before:
after:
Codecov Report
:exclamation: No coverage uploaded for pull request base (
branch-22.10@6d38240
). Click here to learn what that means. The diff coverage isn/a
.
:exclamation: Current head 0a9cfd4 differs from pull request most recent head fd5f28c. Consider uploading reports for the commit fd5f28c to get more accurate results
@@ Coverage Diff @@
## branch-22.10 #11361 +/- ##
===============================================
Coverage ? 86.43%
===============================================
Files ? 144
Lines ? 22808
Branches ? 0
===============================================
Hits ? 19714
Misses ? 3094
Partials ? 0
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact)
,ø = not affected
,? = missing data
Powered by Codecov. Last update 6d38240...fd5f28c. Read the comment docs.
Rerun tests.
@cwharris as this no longer applies, do you think the semantics would be useful elsewhere?
@upsj yeah, this is something I wanted to add to the data chunk reader from the beginning, I just didn't have a use case for it. Anywhere we need to copy bytes in to a specific location, this is the right API. Otherwise we could be performing an extra copy. That said it has no place in the recently merged multibyte_split
implementation.