cudf [FEA] Update to CCCL 2.3 or 2.4

Currently RAPIDS is built with CCCL 2.2. This issue lists tasks that we can follow up on once we have upgraded to CCCL 2.3 or 2.4. (We haven't decided on an exact timeline for updating, so RAPIDS could target CCCL 2.3 or 2.4 depending on that timing.)

CCCL 2.3

If we upgrade to 2.3, we will need to carry a patch for https://github.com/NVIDIA/cccl/pull/1499; the fix ships upstream in 2.4.
CCCL 2.3 performance may be a motivating factor:
- Up to 60% performance improvements of cub::DeviceSelect::UniqueByKey, cub::DeviceScan::ExclusiveSumByKey, and cub::DeviceReduce::ReduceByKey on A100.
- cub::DeviceSegmentedReduce now supports 64-bit indexing.
Replace device uses of thrust::optional with cuda::std::optional
- https://github.com/rapidsai/cudf/pull/15091#issuecomment-2004286213
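
As a rough illustration of the thrust::optional to cuda::std::optional replacement, here is a minimal sketch of what a migrated helper might look like. The names `optional_t`, `SKETCH_HOST_DEVICE`, `element_or_null`, and `value_or_zero` are hypothetical and do not come from cudf. The sketch assumes the `<cuda/std/optional>` header is on the include path; the `std::optional` fallback is there only so the snippet compiles on a machine without CCCL and is not meant for device code.

```cpp
// Sketch only: all names below are hypothetical, not cudf or CCCL APIs.
#if defined(__has_include)
#  if __has_include(<cuda/std/optional>)
#    include <cuda/std/optional>
#    define SKETCH_HAS_CUDA_STD_OPTIONAL 1
#  endif
#endif

#ifdef SKETCH_HAS_CUDA_STD_OPTIONAL
// Preferred path: the libcu++ optional, usable in host and device code.
template <typename T>
using optional_t = cuda::std::optional<T>;
#else
// Fallback (assumption for illustration): host-only standard optional.
#  include <optional>
template <typename T>
using optional_t = std::optional<T>;
#endif

// Annotate for both host and device only when compiled by nvcc.
#ifdef __CUDACC__
#  define SKETCH_HOST_DEVICE __host__ __device__
#else
#  define SKETCH_HOST_DEVICE
#endif

// Returns data[idx] wrapped in an optional, or an empty optional when
// the validity mask marks the element as null.
SKETCH_HOST_DEVICE inline optional_t<int> element_or_null(int const* data,
                                                          bool const* valid,
                                                          int idx)
{
  if (!valid[idx]) { return optional_t<int>{}; }
  return optional_t<int>{data[idx]};
}

// Unwraps an optional, substituting 0 for the empty case.
SKETCH_HOST_DEVICE inline int value_or_zero(optional_t<int> const& v)
{
  return v.has_value() ? *v : 0;
}
```

Routing the type through a single alias keeps the change local: call sites use `optional_t<T>` and only the alias needs to move from one optional implementation to the other.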

CCCL 2.4

See the note on the patch under CCCL 2.3 above.

Additional Context

https://github.com/NVIDIA/cccl/releases
Test PR: #14704

cc: @miscco @jrhemstad @robertmaynard

Mar 18 '24 15:03 bdice

CUDA 12.4 ships with CCCL 2.3.

The CCCL support matrix states that CCCL 2.2 is officially untested/unsupported when using CUDA 12.4+, so we will need to move to a newer version.

Apr 02 '24 20:04 robertmaynard

CCCL 2.3 also requires changes to projects due to thrust::tuple constructor changes.

Apr 16 '24 13:04 robertmaynard

We can also tackle https://github.com/NVIDIA/cuCollections/issues/469 after this migration is complete.

May 02 '24 18:05 bdice

Are we now considering going straight to 12.5? See https://github.com/rapidsai/rapids-cmake/pull/607 (CC @trxcllnt).

May 15 '24 19:05 vyasr

RAPIDS has been updated to CCCL 2.5 (a pre-release commit for now, and eventually the 2.5.0 release).

Jun 11 '24 22:06 bdice
