cccl icon indicating copy to clipboard operation
cccl copied to clipboard

[FEA]: Add efficient unstable thread sort

Open Nyrio opened this issue 1 year ago • 0 comments

Is this a duplicate?

  • [X] I confirmed there appear to be no duplicate issues for this request and that I agree to the Code of Conduct

Area

CUB

Is your feature request related to a problem? Please describe.

The thread sort currently only has a stable method, the odd-even transpose sorting network, which has a complexity O(N^2).

Describe the solution you'd like

Other sorting networks such as Batcher's odd-even mergesort and Parberry's pairwise sort scale much better, with a complexity O(N log^2(N)), so for instance at 16 items/thread they would be 2x faster, at 64 items/thread 4x faster.

The block merge sort has stable and unstable APIs which could select the appropriate thread sort accordingly.

Describe alternatives you've considered

No response

Additional context

No response

Nyrio avatar Mar 19 '24 20:03 Nyrio