hdbscan-cpp

QuickSort implementation on undirected graph

Open st235 opened this issue 1 year ago • 2 comments
Hello everybody,

Could someone please explain a small detail of the quicksort implementation in the UndirectedGraph class?

The method selectPivotIndex contains an early-return block:

	if (startIndex - endIndex <= 1)
		return startIndex;

It seems that this condition is always true: whenever startIndex <= endIndex, the difference startIndex - endIndex is zero or negative (assuming signed indices). That is fine for the correctness of quicksort, since the pivot can be any element between startIndex and endIndex, but it would make the heuristic after this block unreachable. Is that correct?
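To make the question concrete, here is a minimal sketch that isolates the condition. The function names are hypothetical, the indices are assumed to be signed int, and callers are assumed to pass startIndex <= endIndex. The second function is only a guess at what a range-length check might look like, not necessarily what the author intended.

```cpp
#include <cassert>

// Hypothetical stand-in for the early-return test quoted above.
// Assumption: indices are signed ints and startIndex <= endIndex.
bool earlyReturnAsWritten(int startIndex, int endIndex) {
    assert(startIndex <= endIndex);
    // The difference is never positive here, so this is always true.
    return startIndex - endIndex <= 1;
}

// A guess at a range-length check: true only when the range
// holds at most two elements.
bool earlyReturnOnRangeLength(int startIndex, int endIndex) {
    assert(startIndex <= endIndex);
    return endIndex - startIndex <= 1;
}
```

Under these assumptions, earlyReturnAsWritten(0, 100) is true, so the pivot heuristic would never run, while earlyReturnOnRangeLength(0, 100) is false.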

st235, Mar 02 '24 12:03

Ah, now I get it. Do you see a case where it fails?

rohanmohapatra, Sep 03 '24 04:09

Haha, it took me some time to remember the context 😅

I believe that instead of reading this condition as

	if (startIndex - endIndex <= 1)
		return startIndex;

it is better to read it as

	if (startIndex <= endIndex + 1)
		return startIndex;

which seems to be always or almost always true.

I have not measured the real performance, but quicksort is known to degrade when its partition splits are unbalanced.

For this implementation, it seems that if the array is sorted in reverse order, there is a linear number of swaps on each step of the recursion (because startIndex is always chosen as the pivot). In that case, the performance will be O(n^2).
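A rough way to check this without touching the library is the sketch below. It is not the UndirectedGraph code: quickSortFirstPivot and comparisonsOnReversedInput are hypothetical helpers, the partition is a plain Lomuto-style scheme, and it counts element comparisons rather than swaps. It only illustrates how always taking the first element as the pivot behaves on reverse-sorted input.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper: sorts values[lo..hi] in place, always using the
// first element of the range as the pivot (Lomuto-style partition).
// Returns the number of element comparisons performed.
long long quickSortFirstPivot(std::vector<int>& values, int lo, int hi) {
    if (lo >= hi) return 0;
    std::swap(values[lo], values[hi]);  // move the chosen pivot to the end
    const int pivot = values[hi];
    int store = lo;                     // next slot for an element < pivot
    long long comparisons = 0;
    for (int i = lo; i < hi; ++i) {
        ++comparisons;
        if (values[i] < pivot) {
            std::swap(values[i], values[store]);
            ++store;
        }
    }
    std::swap(values[store], values[hi]);  // put the pivot in its final slot
    comparisons += quickSortFirstPivot(values, lo, store - 1);
    comparisons += quickSortFirstPivot(values, store + 1, hi);
    return comparisons;
}

// Builds n, n-1, ..., 1 and returns the comparison count for sorting it.
long long comparisonsOnReversedInput(int n) {
    assert(n >= 0);
    std::vector<int> values(n);
    for (int i = 0; i < n; ++i) values[i] = n - i;
    return quickSortFirstPivot(values, 0, n - 1);
}
```

With this particular partition scheme, every split on reverse-sorted input is maximally unbalanced, so the count is n(n-1)/2: comparisonsOnReversedInput(100) gives 4950 and comparisonsOnReversedInput(200) gives 19900, about four times as many for twice the input.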

st235, Sep 09 '24 21:09