Daniël Heres

167 comments by Daniël Heres

I think @Tmonster runs them once in a while

Thanks @Tmonster, we will probably make a PR soon for DataFusion 42, as many performance improvements were added compared to 41 and DataFusion will perform better for aggregations.

Is the issue that we can remove tasks/partitions/stages that are empty or is it a bug?

Hi @jackwener . The idea is that any expression within DataFusion receives a name, so the nodes in a LogicalPlan use the `NamedExpr` type. The name in this type is...

In what situations would these changes lead to better performance? I.e., why is query 28 ~1.10x faster?

It would be worthwhile to run the `clickbench_extended` benchmarks as well (`./bench.sh run clickbench_extended`)

@mingmwang I think the same thing applies to broadcasting exchanges as to normal exchanges: they are spilled to disk by default and might be kept in memory if the memory budget allows....

That's a good observation, @mingmwang! The difference with CollectLeft is that this mode collects the left side into one partition, whereas with broadcast we would broadcast the output of...