htop icon indicating copy to clipboard operation
htop copied to clipboard

Consolidate functions for sorting vectors

Open BenBE opened this issue 2 months ago • 11 comments

Currently there's Vector_quickSort, Vector_quickSortCustomCompare, and Vector_insertionSort for sorting vectors.

Apply the Highlander rule and let there only be one: Vector_sort.

While at it, update the signatures to allow for context support in the comparison functions:

typedef int (*Object_SortFn)(const void *a, const void *b, void *ctx);
int Vector_sort(Vector *v, Object_SortFn cmp, void *ctx);

Note: The single replacement API MUST be a stable sorting algorithm internally to conserve behaviour of Vector_insertionSort.

BenBE avatar Oct 12 '25 07:10 BenBE

I think Vector_insertionSort algorithm has its merits and I oppose removing it for now. (1) Insertion sort is stable; (2) insertion sort works better for arrays that are 80% sorted.

Explorer09 avatar Oct 12 '25 07:10 Explorer09

There's Timsort and it's derived version Powersort that are a combination of stable Merge sort with basically insertion sort for small sub-arrays.

So no, insertion sort won't be gone entirely, but it will be only a detail in the actual implementation using something more advanced. Thus, I'm fully aware that insertion sort and quick sort have their reason for existence in the current implementation, but it's for a reason, why several languages (Java, Python) switched to something else in their standard library when it comes to sorting.

BenBE avatar Oct 12 '25 20:10 BenBE

@BenBE I was considering leaving two interfaces for sorting, like C++ std::sort and std::stable_sort, and omit the implementation details in the function names.

As for what algorithm will be for stable_sort, my idea was a kind of in-place merge sort that does not take O(n) extra memory (it would take at most O(log(n)) stack space for recursion, though).

Explorer09 avatar Oct 13 '25 01:10 Explorer09

I'm currently looking into PowerSort, which is both stable AND a variant of merge sort. Also, it should be reasonable to implement its merging step in-place, leaving us with static worst case memory usage of O(log(n)).

BenBE avatar Oct 13 '25 05:10 BenBE

I'm currently looking into PowerSort, which is both stable AND a variant of merge sort. Also, it should be reasonable to implement its merging step in-place, leaving us with static worst case memory usage of O(log(n)).

For me personally, any sorting algorithm that needs O(n) space would be out (including naïve merge sorts). It's simply a bad idea to require malloc'ing additional space to sort a big table. In-place merge sorts require slightly more time (O(n×log(n)×log(n)) in worst case) but the advantage of sorting in-place is what interested me. From what I've read on Wikipedia, the primary difference between Timsort and Powersort on the strategies of merging sub-arrays. That can be tuned easily after I try to write the primary in-place merge algorithm first (the in-place merge algorithm would be O(n×log(n)) per merge, comparing to O(n) which is the non-in-place version).

Explorer09 avatar Oct 13 '25 06:10 Explorer09

I'm with you regarding memory usage of O(n), in particular with small arrays this introduces an overhead that's just going to cause performance issues due to how frequent we sort things.

My current WIP can be found at branch vector-sort-replace; but because I'm still working on the implementation of power sort it currently falls back to insertion sort for everything. That's really WIP; but will show the interface I'm aiming for.

NB: Currently looking through a lecture on Powersort (German) by one of the authors.

BenBE avatar Oct 13 '25 06:10 BenBE

Hi! I’m new here and looking to start contributing to open source. If I understand this issue correctly, there are several sorting algorithms mentioned, but the goal is to implement one reliable sorting method.

zeneral avatar Nov 01 '25 14:11 zeneral

Just a reminder: I have an implementation ready for testing. #1798 It's a natural, in-place merge sort that works best with partially sorted data.

Explorer09 avatar Nov 01 '25 15:11 Explorer09

@zeneral Welcome. Yes. The issue is basically to rework the internal way sorting is performed and replace the two different interfaces by a single one. The requirements for the replacement are:

  • Must be a stable sorting algorithm
  • Must be able to take advantages of runs of partially sorted data
  • Should follow the interface I outlined in my branch, to allow comparing different implementations

Currently htop uses both Quick Sort and Insertion Sort. These should be replaced by e.g. a single one like Power Sort (improvement over Tim Sort with guaranteed bounded static memory usage).

Which brings us to another secondary requirement:

  • Should avoid dynamic memory allocation when possible (bounded stack use or in-place is fine).

Be aware that I'm purposefully waiting for several implementations to be available, so different variants can be compared for their runtime behavior.

BenBE avatar Nov 01 '25 20:11 BenBE

I wrote this but don't know how to test it, though.

static void shellSort(Object** array, int left, int right, Object_Compare compare){
   for(int gap = (right + 1) / 2; gap > 0; gap /= 2){
      for(int i = gap; i <= right; i++){
         Object* t = array[i];
         int j = i;
         while(j >= gap && compare(array[j - gap], t) > 0){
            array[j] = array[j - gap];
            j--;
         }
         array[j] = t;
      }
   }
}

void vector_shellShort(Vector* this){
   assert(this->type->compare);
   assert(Vector_isConsistent(this));
   shellSort(this->array, 0, this->items-1, this->type->compare);
   assert(Vector_isConsistent(this));
}

zeneral avatar Nov 04 '25 07:11 zeneral

@zeneral Take a look at d3cd557e0534a45ab971397d4c407116547d3f6d, which proposes the intended API for after the consolidation. Replace the implementation in Vector_sort by your variant and compare with a tool like callgrind¹+kcachegrind. Note that you need debug information enabled in order to see proper function/line info. Another check might be cache behaviour of the implementation, which can be checked with cachegrind.

¹ Something like valgrind --tool=callgrind --dump-instr=yes --dump-line=yes --max-stackframe=64 --realloc-zero-bytes-frees=yes ./htop usually works. Load the resulting file with kcachegrind

BenBE avatar Nov 05 '25 12:11 BenBE