cuvs icon indicating copy to clipboard operation
cuvs copied to clipboard

[REVIEW][Java] One PinnedMemoryBuffer per CuVSResourcesImpl

Open ldematte opened this issue 2 months ago • 9 comments

While profiling cuvs-java, we found that allocating a PinnedMemoryBuffer for each host->device or device->host memory copy was unnecessary and wasteful. This PR moves the allocation of a PinnedMemoryBuffer to CuVSResourcesImpl, so that the buffer can be cached and reused. Since CuVSResources are already meant to be per-thread, this is safe, as the PinnedMemoryBuffer will never be used concurrently. In order to do it cleanly, we introduced two named ScopedAccess classes and a helper method that will always find its way to the internal MemorySegment used by native functions to access the buffer, without the need to expose it via the public interface.

ldematte avatar Oct 21 '25 12:10 ldematte

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

copy-pr-bot[bot] avatar Oct 21 '25 12:10 copy-pr-bot[bot]

/ok to test aa5e469

mythrocks avatar Oct 28 '25 20:10 mythrocks

I've rebased this PR and resolved a conflict.
But it looks like we're going to need to fix the copyright headers on the changed files.

mythrocks avatar Oct 28 '25 22:10 mythrocks

/ok to test d7fcdb3

benfred avatar Oct 28 '25 23:10 benfred

/ok to test c4f0570

benfred avatar Oct 28 '25 23:10 benfred

/ok to test d792fce

benfred avatar Oct 29 '25 19:10 benfred

/ok to test d792fce

@benfred: This PR will likely continue to fail CI until the copyright headers are fixed for this change. Plus, there's another change pending (for handling cases where the row is super-large).

mythrocks avatar Oct 29 '25 20:10 mythrocks

@mythrocks the copyright issue should be fixed with c4f0570 (but CI is still failing here on CagraMultiThreadStabilityIT.testQueryingUsingMultipleThreadsWithPrivateResources:62->testQueryingUsingMultipleThreads:162 MultiThreaded stablity test failed: null errors in the java unittests =(

benfred avatar Oct 29 '25 21:10 benfred

@benfred @mythrocks I fixed the test to address the issue

I also addressed the issue with "too large rows" that won't fit in the buffer, falling back to a strategy that does direct copies (no buffering) in those cases (one for device -> host, the other for host -> device).

ldematte avatar Nov 06 '25 15:11 ldematte

@mythrocks I have rebased this one onto release/25.12 too

ldematte avatar Nov 18 '25 08:11 ldematte

/ok to test f16d295

mythrocks avatar Nov 21 '25 16:11 mythrocks

CI failures are a result of https://github.com/rapidsai/raft/pull/2813. Should be resolved by https://github.com/rapidsai/raft/pull/2881.

mythrocks avatar Nov 21 '25 19:11 mythrocks

/ok to test f16d295

mythrocks avatar Nov 24 '25 17:11 mythrocks

/ok to test faa1608

mythrocks avatar Nov 24 '25 17:11 mythrocks

/merge

mythrocks avatar Nov 25 '25 01:11 mythrocks