Daniël Heres
Hm, I'm getting different results now. Something may have been running in the background on my machine. I'll update the results accordingly.
Ok - updated the benchmark results - looks good to go now!
Thanks @AssHero @alamb
Also see #279 #280 for extra context and a proposed implementation
> > Also see #279 #280 for extra context and a proposed implementation > > when are we going to merge or review them? @Dandandan I think the PR is...
I think there are two items here:

* parallelizing `order by` followed by `limit n`
* implement topk operator

Currently a plan looks like the following ``` | physical_plan |...
> > Already with the current implementation we should be able to rewrite it to: > > I agree this would help performance. However, it will become irrelevant if/when we...
After playing with it a bit more, I think this issue boils down to pushing a limit to `SortPreservingMergeStream` and using it there to only keep top N items instead...
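To make the idea concrete, here is a rough sketch of a sort-preserving merge that takes a limit and stops as soon as it has produced N values. It is not DataFusion code: `merge_with_limit` is a hypothetical helper, and plain `Vec<i64>` partitions stand in for sorted record batch streams.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical helper (not a DataFusion API): merge partitions that are
/// already sorted ascending, emitting at most `limit` values.
fn merge_with_limit(partitions: &[Vec<i64>], limit: usize) -> Vec<i64> {
    // Min-heap of (value, partition index, offset within that partition).
    // `Reverse` turns std's max-heap into a min-heap.
    let mut heap = BinaryHeap::new();
    for (p, part) in partitions.iter().enumerate() {
        if let Some(&v) = part.first() {
            heap.push(Reverse((v, p, 0usize)));
        }
    }

    let mut out = Vec::new();
    // Stop once `limit` values are emitted; the rest of the input is never read.
    while out.len() < limit {
        match heap.pop() {
            Some(Reverse((v, p, i))) => {
                out.push(v);
                // Refill the heap from the partition we just consumed from.
                if let Some(&next) = partitions[p].get(i + 1) {
                    heap.push(Reverse((next, p, i + 1)));
                }
            }
            None => break, // all partitions exhausted before reaching the limit
        }
    }
    out
}
```

With the limit inside the merge, the work is bounded by N pops rather than by the total number of input rows.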
So it seems we *mostly* have the benefits of a TopK operator now by pushing down the limit to individual operations. There are a couple of followups possible (will create...
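For comparison, a dedicated TopK step could look roughly like the sketch below: a max-heap bounded at `k` entries, so memory stays at O(k) no matter how many rows come in. `top_k_smallest` is a hypothetical name, and the `i64` values are an assumption standing in for sort keys; this is only meant to illustrate the technique, not how DataFusion does or would implement it.

```rust
use std::collections::BinaryHeap;

/// Hypothetical helper (not a DataFusion API): return the `k` smallest
/// values in ascending order using a heap that never exceeds `k` entries.
fn top_k_smallest(values: impl IntoIterator<Item = i64>, k: usize) -> Vec<i64> {
    if k == 0 {
        return Vec::new();
    }
    // Max-heap: the root is the largest value currently kept.
    let mut heap = BinaryHeap::with_capacity(k);
    for v in values {
        if heap.len() < k {
            heap.push(v);
        } else if let Some(mut top) = heap.peek_mut() {
            // Replace the current largest only if the new value is smaller.
            // The heap is re-ordered when `top` is dropped.
            if v < *top {
                *top = v;
            }
        }
    }
    heap.into_sorted_vec()
}
```

Running this per partition and then merging the small per-partition results gives the same answer as a full sort followed by a limit, as long as each partition keeps at least `k` values.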
I added https://github.com/apache/arrow-datafusion/issues/3431 to the list.