cudf
[FEA] Update to CCCL 2.3 or 2.4
Currently RAPIDS is built with CCCL 2.2. This issue lists tasks that we can follow up on once we have upgraded to CCCL 2.3 or 2.4. (We haven't decided on an exact timeline for updating, so RAPIDS could target CCCL 2.3 or 2.4 depending on that timing.)
CCCL 2.3
- If we upgrade to 2.3, we will need to apply https://github.com/NVIDIA/cccl/pull/1499 as a patch. The fix will be included in 2.4.
- CCCL 2.3 performance may be a motivating factor:
  - Up to 60% performance improvements of cub::DeviceSelect::UniqueByKey, cub::DeviceScan::ExclusiveSumByKey, and cub::DeviceReduce::ReduceByKey on A100.
  - cub::DeviceSegmentedReduce now supports 64-bit indexing.
- Replace device uses of thrust::optional with cuda::std::optional: https://github.com/rapidsai/cudf/pull/15091#issuecomment-2004286213
CCCL 2.4
- See notes on patch above
Additional Context
- https://github.com/NVIDIA/cccl/releases
- Test PR: #14704
cc: @miscco @jrhemstad @robertmaynard
CUDA 12.4 ships with CCCL 2.3.
The CCCL support matrix states that CCCL 2.2 is officially untested/unsupported with CUDA 12.4+, so we will need to move to a newer version.
CCCL 2.3 also requires changes to projects due to thrust::tuple constructor changes.
We can also tackle https://github.com/NVIDIA/cuCollections/issues/469 after this migration is complete.
Are we now considering going straight to 12.5? See https://github.com/rapidsai/rapids-cmake/pull/607 (CC @trxcllnt).
RAPIDS has been updated to CCCL 2.5 (a pre-release commit for now, and eventually the 2.5.0 release).