Remove `hostdevice_vector::element` due to unnecessary synchronization (Part 2 of miss-sync)
Description
This PR addresses issue #18967 and is one part of the draft PR #18968, split out so it can be merged separately. It removes `hostdevice_vector::element` because the function performs an internal `cudaMemcpy` into pageable host memory, which causes unnecessary synchronization. Its only call site is replaced manually.
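To illustrate why a per-element accessor of this kind is costly, here is a minimal CPU-only sketch. It is not libcudf code and makes no CUDA calls: `mock_hostdevice_vector`, `blocking_copies`, `sum_via_element`, and `sum_via_bulk_copy` are hypothetical names, and the "device" buffer is ordinary host memory. The counter simply models the assumption that each copy back to the host would block the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical CPU-only stand-in for a mirrored host/device buffer.
// `device_` is plain host memory here; `blocking_copies` counts the
// transfers that, on a real GPU, would each block the calling thread.
template <typename T>
class mock_hostdevice_vector {
  static_assert(std::is_trivially_copyable<T>::value, "T must be trivially copyable");

 public:
  explicit mock_hostdevice_vector(std::size_t n) : host_(n), device_(n) {}

  // Models an accessor like the removed one: one blocking copy per call.
  T element(std::size_t i) {
    assert(i < device_.size());
    T value;
    std::memcpy(&value, &device_[i], sizeof(T));
    ++blocking_copies;
    return value;
  }

  // Models copying the whole buffer back to the host mirror in one transfer.
  void device_to_host() {
    std::memcpy(host_.data(), device_.data(), device_.size() * sizeof(T));
    ++blocking_copies;
  }

  T& host(std::size_t i) { return host_[i]; }
  T& device(std::size_t i) { return device_[i]; }  // exists only so the mock can be filled
  std::size_t size() const { return device_.size(); }

  std::size_t blocking_copies = 0;

 private:
  std::vector<T> host_;
  std::vector<T> device_;
};

// Reads every value through the per-element accessor: size() blocking copies.
template <typename T>
T sum_via_element(mock_hostdevice_vector<T>& v) {
  T total{};
  for (std::size_t i = 0; i < v.size(); ++i) total += v.element(i);
  return total;
}

// Copies once, then reads the host mirror: a single blocking copy.
template <typename T>
T sum_via_bulk_copy(mock_hostdevice_vector<T>& v) {
  v.device_to_host();
  T total{};
  for (std::size_t i = 0; i < v.size(); ++i) total += v.host(i);
  return total;
}
```

In this model, both functions return the same sum, but the per-element version pays one blocking transfer per element, whereas the bulk version pays one in total. With the accessor gone, call sites have to make that copy explicit.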
@mhaseeb123: Adding a DO NOT MERGE label for now, as the removed code is needed to reproduce the compiler segfault issue: #18980
Checklist
- [x] I am familiar with the Contributing Guidelines.
- [x] New or existing tests cover these changes.
- [x] The documentation is up to date with these changes.
Hi @mhaseeb123, just wanted to check whether there is any update on the bug that’s currently blocking the merge of this pull request. Thanks!
Unfortunately, I don't see any updates on the page. @GregoryKimball @vuule should we just move ahead with this PR and point the NVBug to libcudf branch-25.08 (before this PR's commit) for reproducing the bug?
Thanks for the update! I’m happy to either leave it as is (for NVBug bookkeeping) or proceed with the merge, whichever option aligns with what we discussed.
I would be okay with letting this PR move along. Should not be too hard to recreate the repro even without the element function.
Alrighty then, let's review this and merge
/ok to test 8dc770e
Thanks for approving it and updating the PR title!
/ok to test 729a8e9
/merge