compute-shader-101

Sort experiment

Open raphlinus opened this issue 1 year ago • 0 comments

This branch contains an experiment in sorting. It is not intended to be merged, but having a draft PR gives the branch a stable identifier.

The tip contains an implementation adapted mostly from FidelityFX sort, but with a version of warp-local multi-split (WLMS) inspired by Onesweep. In all cases, subgroup operations have been replaced by workgroup shared memory. The branch has numerous checkpoints, including a mostly working version without WLMS that is closer to the original FidelityFX. Note, however, that this checkpoint exhibits failures consistent with a missing barrier. The tip appears to pass correctness tests, but none of this has been carefully validated.
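For readers unfamiliar with the general approach, here is a minimal sequential sketch of a least-significant-digit radix sort, the family of algorithm that both FidelityFX sort and Onesweep belong to. It is not the code in this branch: it runs on the CPU, has no workgroups, barriers, or multi-split, and the names `RADIX_BITS`, `radix_pass`, and `radix_sort_u32` are hypothetical. The 4-bit digit width is an assumption chosen for illustration. Each pass does what the GPU version spreads across many threads: count the digits, take an exclusive prefix sum, then scatter the keys stably.

```python
# Sequential CPU sketch of an LSD radix sort (illustrative only).
RADIX_BITS = 4            # assumption: 4-bit digits per pass
RADIX = 1 << RADIX_BITS   # number of buckets per pass
MASK = RADIX - 1


def radix_pass(keys, shift):
    """One stable counting-sort pass on the digit at bit offset `shift`."""
    # 1. Histogram: count how many keys fall into each digit bucket.
    counts = [0] * RADIX
    for k in keys:
        counts[(k >> shift) & MASK] += 1

    # 2. Exclusive prefix sum: first output index for each bucket.
    offsets = [0] * RADIX
    total = 0
    for d in range(RADIX):
        offsets[d] = total
        total += counts[d]

    # 3. Stable scatter: keys with equal digits keep their relative order.
    out = [0] * len(keys)
    for k in keys:
        d = (k >> shift) & MASK
        out[offsets[d]] = k
        offsets[d] += 1
    return out


def radix_sort_u32(keys):
    """Sort non-negative integers below 2**32, one digit per pass."""
    for shift in range(0, 32, RADIX_BITS):
        keys = radix_pass(keys, shift)
    return keys
```

On a GPU, step 3 is where ordering within a workgroup has to be coordinated, and where reads and writes of shared memory need barriers between them.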

Sort throughput is approximately 1G elements/s on M1 Max.

raphlinus, Jan 20 '24 01:01