cccl
cccl copied to clipboard
[FEA]: Add efficient unstable thread sort
Is this a duplicate?
- [X] I confirmed there appear to be no duplicate issues for this request and that I agree to the Code of Conduct
Area
CUB
Is your feature request related to a problem? Please describe.
The thread sort currently only has a stable method, the odd-even transpose sorting network, which has a complexity O(N^2).
Describe the solution you'd like
Other sorting networks such as Batcher's odd-even mergesort and Parberry's pairwise sort scale much better, with a complexity O(N log^2(N)), so for instance at 16 items/thread they would be 2x faster, at 64 items/thread 4x faster.
The block merge sort has stable and unstable APIs which could select the appropriate thread sort accordingly.
Describe alternatives you've considered
No response
Additional context
No response