hdbscan-cpp
QuickSort implementation on undirected graph
Hello everybody,
Could someone please explain a small detail of the quicksort implementation in the UndirectedGraph class?
In the method selectPivotIndex there is an early return block
if (startIndex - endIndex <= 1)
return startIndex;
It seems that this condition is always true. That's totally fine for the correctness of the quicksort, since the pivot can be any element between startIndex and endIndex, but it means the heuristic after this block is never reached. Is that correct?
Ah, now I get it. Do you see a case where it fails?
Haha, it took me some time to remember the context 😅
I believe that instead of reading this condition as
if (startIndex - endIndex <= 1)
return startIndex;
it is better to take a look at it as
if (startIndex <= endIndex + 1)
return startIndex;
which holds whenever startIndex <= endIndex, so it seems to be always (or almost always) true.
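To make the point concrete, here is a minimal sketch that counts how often the check fires over every valid range. The names earlyReturnAsWritten, earlyReturnFlipped and countEarlyReturns are made up for this post, and the flipped condition is only my assumption about what might have been intended, not the library's code.

```cpp
#include <cassert>

// The condition exactly as quoted in this thread.
bool earlyReturnAsWritten(int startIndex, int endIndex) {
    return startIndex - endIndex <= 1;
}

// Hypothetical alternative (an assumption, not taken from the library):
// measure the range length as endIndex - startIndex instead.
bool earlyReturnFlipped(int startIndex, int endIndex) {
    return endIndex - startIndex <= 1;
}

// Counts the ranges 0 <= start <= end < n for which the given check fires.
int countEarlyReturns(bool (*check)(int, int), int n) {
    assert(n >= 0);
    int count = 0;
    for (int start = 0; start < n; ++start) {
        for (int end = start; end < n; ++end) {
            if (check(start, end)) {
                ++count;
            }
        }
    }
    return count;
}
```

For n = 10 there are 55 valid ranges. The check as written fires on all 55 of them, while the flipped variant fires only on the 19 ranges holding one or two elements.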
I have not measured the real performance, but quicksort is known to degrade when it has to deal with unbalanced partition splits.
For this implementation, it seems that if the array is sorted in reverse order, then there is a linear number of swaps (because startIndex is always chosen as the pivot) at each step of the recursion. In that case, the performance will be O(n^2).
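Here is a rough way to see the quadratic behaviour without touching the library. This is a standalone sketch, assuming a Lomuto-style partition that always takes the first element as the pivot; the partition scheme in UndirectedGraph may well differ, and quickSortFirstPivot and reverseSorted are names I made up. It counts comparisons rather than swaps, since that number is easy to check by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sorts values[lo..hi] in place, always using values[lo] as the pivot
// (Lomuto-style partition, an assumption for this sketch).
// Returns the number of element comparisons performed.
long long quickSortFirstPivot(std::vector<int>& values, int lo, int hi) {
    assert(lo >= 0 && hi < static_cast<int>(values.size()));
    if (lo >= hi) {
        return 0;  // zero or one element: nothing to do
    }
    const int pivot = values[lo];
    int boundary = lo;  // last index known to hold a value < pivot
    long long comparisons = 0;
    for (int j = lo + 1; j <= hi; ++j) {
        ++comparisons;
        if (values[j] < pivot) {
            ++boundary;
            std::swap(values[boundary], values[j]);
        }
    }
    std::swap(values[lo], values[boundary]);  // move pivot to its final slot
    comparisons += quickSortFirstPivot(values, lo, boundary - 1);
    comparisons += quickSortFirstPivot(values, boundary + 1, hi);
    return comparisons;
}

// Builds the sequence n, n-1, ..., 1.
std::vector<int> reverseSorted(int n) {
    std::vector<int> values(static_cast<std::size_t>(n));
    for (int i = 0; i < n; ++i) {
        values[static_cast<std::size_t>(i)] = n - i;
    }
    return values;
}
```

On reverseSorted(100) every pivot ends up being the smallest or largest element of its subrange, so each call shrinks the problem by only one element and the total is 100 * 99 / 2 = 4950 comparisons, i.e. n(n-1)/2.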