Feature: Update sort algorithm using Loser Tree
Summary
Currently Databend's sort algorithm using MergSort based on heap. The main algorithms of multiple merging include heap sort, winner tree and loser tree. In these three algorithms, each heap adjustment of heap sorting requires comparison with the left and right child nodes, and the comparison times is 2logN, while the comparison times of winner tree and loser tree adjustment are logN. The difference is that the winner tree needs to compare with sibling nodes and update the parent node, while the loser tree only needs to compare with the parent node, and the memory access times are less.
I'm genuinely interested in this task, and if it's possible, I would be honored to take on the responsibility of completing it.
I wrote a loser_tree implementation, but I can't integrate it into src/query/pipeline/transforms/src/processors/transforms/sort/merger.rs.
https://github.com/forsaken628/databend/blob/c14c3085b7ba2e3841eeef0af47ab2935902463b/src/query/pipeline/transforms/src/processors/transforms/sort/loser_tree.rs
- Changing the length (push, pop) requires rebuilding the entire tree, so the current implementation doesn't support push, and pop just flags it.
- peek_top2 will be slower than find_bigger_child_of_root.
cc: @RinChanNOWWW
IMO, you can add a new trait named Merger(or other names like MergeAlgorithm) and let Heap and LoserTree implement the trait. The methods of the trait can be the public methods of the current HeapMerger. I think this is a more easy way to introduce your lose tree to the sort operator.