Igor van den Hoven
Igor van den Hoven
1. More sophisticated benchmarks are probably needed. 2. Adding random pivots might prevent a beneficial switch to quadsort. 3. I haven't been able to improve performance with a custom call...
As far as I can tell the flux_max function isn't stable? Moving the maximum element to the end might be possible to accomplish in n comparisons and ~ log n...
On second thought, I think the bubble sort would be the right way to go. n comps and n swaps.
I reckon there will be a variety of solutions. I'll try to wrap my head around this partitioning logic in the next few days and get back to you.
3x3 is as good at avoiding the worst case partition as 1x7 with fewer comparisons. 1x7 requires 6+5+4+3+2+1 = 21 comparisons worst case while 3x3 is at 3+3+3+3 = 12....
It's possible to perform a branchless run analysis at a ~3% performance cost for random, useful for detecting pipe organ distributions.
I'm getting good performance with a branchless bubble sort. I typically use `./a.out 10000 100 1 1` for accurate readings, though I had to drop down to `./a.out 1000 100...
I took a closer look at the actual partitioning logic, and I think the analyzer would take care of most cases where flux_max would be beneficial? Another concern is that...
This seems to work to deal with the cnt * cnt % 10000 distribution for large arrays. Perhaps a bit heavy-handed. It's still not quite on par with pdqsort for...
That lines up pretty closely with my own pivot choices based on benchmarks. I might be able to write a dynamic pivot selection routine.