Refactor `thrust::reduce` to use `cub::DeviceReduce`

Open elstehle opened this issue 10 months ago • 0 comments

This is a sub-task of Thrust/CUB kernel consolidation https://github.com/NVIDIA/cccl/issues/26

Prepare cub::DeviceReduce for feature parity needed by thrust::reduce:

  • [ ] Introduce vsmem utility to cub::DeviceReduce
  • [ ] Add tests to CUB that check that cub::DeviceReduce correctly uses the fallback policy (see https://github.com/NVIDIA/cccl/pull/1379/commits/fdf565e6cc063103643ea0e964b2437400721c5e)
  • [ ] Add tests to CUB that check that cub::DeviceReduce correctly uses virtual shared memory (see https://github.com/NVIDIA/cccl/pull/1379/commits/fdf565e6cc063103643ea0e964b2437400721c5e)

Refactor thrust::reduce to use cub::DeviceReduce:

  • [ ] Make thrust::reduce use cub::DeviceReduce (see https://github.com/NVIDIA/cccl/pull/1379/commits/948817ed034a1f704433e4c5e13444e0b9a75106)
  • [ ] Add dynamic 32/64-bit offset type-dispatch to thrust::reduce (see 948817e L210-216)
  • [ ] Add sanity tests for large number of items for thrust::reduce (see https://github.com/NVIDIA/cccl/pull/1379/commits/01f32ddb50c5175154b336af83446ab1dfe8b12a)
  • [ ] Add more elaborate testing for cub::DeviceReduce (see https://github.com/NVIDIA/cccl/pull/1612)
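The dynamic 32/64-bit offset type-dispatch item can be illustrated with a host-only sketch. This is not the code from the referenced commit: `reduce_with_offset`, `fits_in_32_bit_offset`, and `reduce_dispatch` are hypothetical names, the sequential loop merely stands in for a reduction that is instantiated per offset type, and the choice of signed `std::int32_t`/`std::int64_t` offsets is an assumption. The idea is to pick the narrower offset type at runtime whenever the item count fits, and fall back to the 64-bit instantiation otherwise.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for an algorithm templated on its offset type.
// A real device reduction would launch kernels here; this sketch just
// sums sequentially using OffsetT as the loop index type.
template <typename OffsetT, typename InputIt, typename T>
T reduce_with_offset(InputIt first, std::uint64_t num_items, T init)
{
  // The selected instantiation must be able to represent the item count.
  assert(num_items <= static_cast<std::uint64_t>(std::numeric_limits<OffsetT>::max()));

  T result = init;
  const OffsetT n = static_cast<OffsetT>(num_items);
  for (OffsetT i = 0; i < n; ++i)
  {
    result = result + first[i];
  }
  return result;
}

// Assumption: signed 32-bit offsets are used whenever the count fits.
inline bool fits_in_32_bit_offset(std::uint64_t num_items)
{
  return num_items <= static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());
}

// Runtime dispatch: both instantiations are compiled, one is chosen per call.
template <typename InputIt, typename T>
T reduce_dispatch(InputIt first, std::uint64_t num_items, T init)
{
  if (fits_in_32_bit_offset(num_items))
  {
    return reduce_with_offset<std::int32_t>(first, num_items, init);
  }
  return reduce_with_offset<std::int64_t>(first, num_items, init);
}
```

The trade-off in this kind of dispatch is that every algorithm is compiled twice, in exchange for cheaper 32-bit index arithmetic on the common small inputs while inputs beyond the 32-bit range remain correct.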

elstehle, Apr 12 '24 16:04