Efficient KotlinNotebookPluginUtils.sortByColumns
Take 10_000_000 rows, open them in a table widget in a notebook, and sort by a column. Sorting and loading take a lot of time. I haven't profiled it, so I can't say for sure where the actual bottleneck is; that needs to be investigated. However, this method sorts the entire dataframe, all 10 million rows, even when only 20 or 100 are going to be displayed. There are more efficient algorithms for this situation, for example `least` or `greatest` from Guava: https://guava.dev/releases/snapshot-jre/api/docs/com/google/common/collect/Comparators.html#least(int,java.util.Comparator)
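To illustrate the idea, here is a minimal sketch of top-k selection that needs no extra dependency. It uses a bounded heap from `java.util.PriorityQueue`, which is a simpler technique than the one Guava uses internally, and `leastK` is a hypothetical helper name, not part of the Kotlin DataFrame API. It runs in roughly O(n log k) time with O(k) extra memory, compared with O(n log n) for a full sort.

```kotlin
import java.util.PriorityQueue

// Hypothetical helper (not part of Kotlin DataFrame or Guava).
// Returns the k smallest elements of `items` in ascending order under `comparator`.
// Elements that compare as equal may come back in any relative order.
fun <T : Any> leastK(items: Iterable<T>, k: Int, comparator: Comparator<T>): List<T> {
    require(k >= 0) { "k must be non-negative" }
    if (k == 0) return emptyList()

    // Max-heap under `comparator`: the head is the largest of the elements kept so far.
    val heap = PriorityQueue<T>(Comparator<T> { a, b -> comparator.compare(b, a) })

    for (item in items) {
        if (heap.size < k) {
            heap.add(item)
        } else if (comparator.compare(item, heap.peek()) < 0) {
            // The new item is smaller than the current largest kept item, so swap them.
            heap.poll()
            heap.add(item)
        }
    }

    // Only the k survivors are sorted, not the whole input.
    return heap.sortedWith(comparator)
}
```

For example, `leastK(listOf(42, 7, 19, 3, 88, 7, 1), 3, naturalOrder<Int>())` returns `[1, 3, 7]`. For the table widget, the same approach would apply to row indices compared by the selected columns, with k set to the number of rows on the visible page; that wiring is an assumption and is not shown here.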
Good idea!
We use a similar algorithm, quickselect, in our percentile/median/quantile implementation: https://github.com/Kotlin/dataframe/blob/b46524691922c1c49c5258b2f74d7ac8aa817c85/core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/math/quantile.kt#L281 though it only returns a single element.
According to the TopKSelector source, their solution uses less memory than quickselect :)
It's worth looking at serialization/deserialization too. Just sorting 10 million rows shouldn't really take long, so I suspect something else is affecting the performance.