Bradley Dice

Results: 608 comments by Bradley Dice

`CCCL 3.1.0 + RAPIDS 25.10` CI is working, and RAPIDS is nearly ready to adopt the new CCCL version in 25.12. This will complete task **(1)** above.

@wence- -- thank you, I've been meaning to voice this exact concern and didn't know how to start the discussion. I think we need close parity between C++ (CCCL) and...

@PointKernel Let's discuss this -- I'm targeting #11656 for 23.06 and it also affects downstream work like #13244. I am surprised that the new cuco work would eliminate the need...

We’ve had some trouble deciding on the appropriate scope for this internally. Perhaps we should do the least intrusive thing and only change device_scalar as @vuule is suggesting. I recognize...

@JigaoLuo The conversation has mostly been around what we really want -- and trying to keep from asking you to go back and forth on designs (should this go in...

@wence- @JigaoLuo Thanks, using `cudaMemcpyAsync` for stream-ordered host copies is a good solution for my concern about double-writes not being stream ordered. I talked about this with @vuule and that...

I think this might be simple, and may not require summing components. I think we can do a conversion/cast to `duration_s` and then cast that as a float type to...
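To illustrate the cast-then-convert idea in the comment above, here is a minimal sketch using plain `std::chrono`. The `duration_s` alias below is a hypothetical stand-in for an integer-seconds duration type, and both helper names are made up for illustration; none of this is taken from the library under discussion. Casting to whole seconds first discards any sub-second part, so a direct conversion to a floating-point duration is shown alongside for comparison.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for an integer-seconds duration type named `duration_s`.
using duration_s = std::chrono::duration<std::int64_t>;

// Cast any duration to whole seconds, then cast the tick count to a float type.
// The integer duration_cast truncates toward zero, so sub-second precision is lost.
template <typename Rep, typename Period>
constexpr double to_whole_seconds(std::chrono::duration<Rep, Period> d)
{
  return static_cast<double>(std::chrono::duration_cast<duration_s>(d).count());
}

// Alternative: convert directly to a floating-point seconds duration,
// which keeps the fractional part without summing components by hand.
template <typename Rep, typename Period>
constexpr double to_fractional_seconds(std::chrono::duration<Rep, Period> d)
{
  return std::chrono::duration<double>(d).count();
}

// Small self-check showing the difference between the two approaches.
inline void example()
{
  constexpr std::chrono::milliseconds ms{1500};
  assert(to_whole_seconds(ms) == 1.0);       // fractional 0.5 s is dropped
  assert(to_fractional_seconds(ms) == 1.5);  // fractional part preserved
}
```

Which variant is appropriate depends on whether sub-second precision matters for the result type being discussed.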

@vyasr I feel like RMM should probably try to enable something like this for CUDA >=12.6,

@wence- Thanks for this issue. Would you be able to work on a fix?