glidesort
glidesort copied to clipboard
Quadratic-time run scanning
The minimum run length commit seems to introduce quadratic behavior for runs somewhat shorter than sqrt(n / 2)
, because the run is repeatedly followed and discarded. If so, this would cause worst-case performance of O(n^(3/2)) by multiplying O(n) time to get past each run by O(sqrt(n)) runs that fit in a length-n array. I took the following timings on a length 1e8 array to confirm that this has a practical impact; the input data is just 0, 1, ... r-1 repeated, for run length r. sqrt(1e8 / 2)
is about 7071; strangely, performance improves gradually from about 6000 to that number instead of sharply as I'd expected. The "% create" here is a loose estimate from perf top of fraction of time spent in LogicalRun<B,T>::create
, and "Time create" is that multiplied by total time.
Run | Time (s) | % create | Time create |
---|---|---|---|
500 | 2.44 | 0.20 | 0.49 |
1000 | 2.96 | 0.25 | 0.74 |
2000 | 3.74 | 0.38 | 1.42 |
4000 | 4.48 | 0.45 | 2.02 |
Ah, I forgot to change the skipping logic in that commit.
If I reject the run I should still skip all those elements, instead of just SMALL_SORT
elements. Will fix soon, thanks for reporting.
Maybe it would be better to do this filtering in logical_merge
? That is, if one of the halves of an unsorted/not-unsorted pair has length less than the threshold, it should be treated as unsorted. To match current behavior better you'd also avoid doing a physical merge if the result will be shorter than the threshold, but I'd think that behavior is actually unwanted, or at least the threshold is too high.
Hmm, yes, actually I think that's better in general.
It seems the patch did not make it into master yet. Any plans to do so? Thanks!