Stop copying LogicalPlan and Exprs in `OptimizeProjections` (2% faster planning)
Note: this PR also includes the changes from https://github.com/apache/datafusion/pull/10410
Which issue does this PR close?
Closes https://github.com/apache/datafusion/issues/10209
Rationale for this change
Make planning faster by avoiding unnecessary copies of `LogicalPlan` and `Expr` nodes in `OptimizeProjections`.
What changes are included in this PR?
- Rewrite `OptimizeProjections` to use treenode APIs
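
To illustrate why an ownership-based tree rewrite avoids copies, here is a minimal sketch using a toy plan type. Everything in it (`Plan`, `Rewritten`, `push_down_projection`) is hypothetical and invented for this example; it is not DataFusion's `LogicalPlan`, its treenode API, or the code in this PR. The only point is that a rewrite that takes the plan by value can move unchanged subtrees into the result instead of cloning them.

```rust
/// Toy logical plan used only for illustration (not DataFusion's `LogicalPlan`).
#[derive(Debug, PartialEq)]
enum Plan {
    Scan { table: String, columns: Vec<String> },
    Projection { exprs: Vec<String>, input: Box<Plan> },
}

/// Result of a rewrite: the resulting plan plus a flag recording whether
/// anything was changed. Hypothetical helper type for this sketch.
#[derive(Debug, PartialEq)]
struct Rewritten {
    plan: Plan,
    changed: bool,
}

/// Push a projection into the scan directly below it.
/// The plan is consumed by value, so fields such as `table` and `exprs`
/// are moved into the new node rather than cloned.
fn push_down_projection(plan: Plan) -> Rewritten {
    match plan {
        Plan::Projection { exprs, input } => match *input {
            Plan::Scan { table, columns } => {
                if exprs.iter().all(|e| columns.contains(e)) {
                    // Every projected name is a scan column: the scan can
                    // produce exactly the projected columns on its own.
                    Rewritten {
                        plan: Plan::Scan { table, columns: exprs },
                        changed: true,
                    }
                } else {
                    // Not a plain column projection: rebuild the same node
                    // from the moved parts, still without cloning.
                    Rewritten {
                        plan: Plan::Projection {
                            exprs,
                            input: Box::new(Plan::Scan { table, columns }),
                        },
                        changed: false,
                    }
                }
            }
            other => {
                // Recurse into the child and reattach the rewritten child.
                let child = push_down_projection(other);
                Rewritten {
                    plan: Plan::Projection {
                        exprs,
                        input: Box::new(child.plan),
                    },
                    changed: child.changed,
                }
            }
        },
        // Leaves are returned as-is (moved, not copied).
        leaf => Rewritten { plan: leaf, changed: false },
    }
}
```

Calling `push_down_projection` on a projection of column `a` over a scan of `a, b` returns a scan of just `a` with `changed` set to `true`; a borrowing version of the same rewrite would have to clone `table` and the surviving columns to build its output.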
Are these changes tested?
Existing tests
Are there any user-facing changes?
- More types of projections can be combined (e.g. `CASE` expressions)
- Faster planning
Benchmark results show a moderate improvement: about 2% overall on TPC-H planning, though some queries such as Q3 are roughly 10% faster.
Details
++ critcmp main projection_pushdown

Throughput was not recorded (`? ?/sec` in every row), so only the ratio and time columns are shown.

| group | main (ratio) | main (time) | projection_pushdown (ratio) | projection_pushdown (time) |
|-------|--------------|-------------|-----------------------------|----------------------------|
| logical_aggregate_with_join | 1.01 | 1217.9±11.93µs | 1.00 | 1208.1±12.12µs |
| logical_plan_tpcds_all | 1.00 | 160.7±1.72ms | 1.00 | 160.6±1.78ms |
| logical_plan_tpch_all | 1.02 | 17.1±0.20ms | 1.00 | 16.8±0.21ms |
| logical_select_all_from_1000 | 1.00 | 18.7±0.14ms | 1.01 | 18.9±0.11ms |
| logical_select_one_from_700 | 1.01 | 820.9±9.73µs | 1.00 | 816.1±20.75µs |
| logical_trivial_join_high_numbered_columns | 1.00 | 765.5±18.71µs | 1.00 | 764.8±64.26µs |
| logical_trivial_join_low_numbered_columns | 1.01 | 749.8±8.67µs | 1.00 | 742.3±7.49µs |
| physical_plan_tpcds_all | 1.03 | 1364.1±11.01ms | 1.00 | 1330.1±16.93ms |
| physical_plan_tpch_all | 1.03 | 94.8±1.43ms | 1.00 | 92.1±1.28ms |
| physical_plan_tpch_q1 | 1.06 | 5.2±0.05ms | 1.00 | 4.9±0.07ms |
| physical_plan_tpch_q10 | 1.03 | 4.4±0.06ms | 1.00 | 4.3±0.08ms |
| physical_plan_tpch_q11 | 1.02 | 3.9±0.06ms | 1.00 | 3.8±0.08ms |
| physical_plan_tpch_q12 | 1.00 | 3.0±0.04ms | 1.02 | 3.1±0.06ms |
| physical_plan_tpch_q13 | 1.03 | 2.1±0.04ms | 1.00 | 2.1±0.03ms |
| physical_plan_tpch_q14 | 1.08 | 2.9±0.07ms | 1.00 | 2.7±0.04ms |
| physical_plan_tpch_q16 | 1.06 | 3.8±0.06ms | 1.00 | 3.6±0.05ms |
| physical_plan_tpch_q17 | 1.04 | 3.6±0.07ms | 1.00 | 3.5±0.07ms |
| physical_plan_tpch_q18 | 1.07 | 4.1±0.05ms | 1.00 | 3.9±0.05ms |
| physical_plan_tpch_q19 | 1.10 | 6.5±0.07ms | 1.00 | 5.9±0.07ms |
| physical_plan_tpch_q2 | 1.04 | 8.0±0.04ms | 1.00 | 7.7±0.09ms |
| physical_plan_tpch_q20 | 1.04 | 4.8±0.06ms | 1.00 | 4.6±0.06ms |
| physical_plan_tpch_q21 | 1.02 | 6.4±0.08ms | 1.00 | 6.2±0.06ms |
| physical_plan_tpch_q22 | 1.00 | 3.4±0.06ms | 1.00 | 3.4±0.08ms |
| physical_plan_tpch_q3 | 1.11 | 3.4±0.07ms | 1.00 | 3.1±0.06ms |
| physical_plan_tpch_q4 | 1.12 | 2.5±0.04ms | 1.00 | 2.2±0.02ms |
| physical_plan_tpch_q5 | 1.09 | 4.8±0.05ms | 1.00 | 4.4±0.06ms |
| physical_plan_tpch_q6 | 1.05 | 1615.2±30.14µs | 1.00 | 1543.2±24.24µs |
| physical_plan_tpch_q7 | 1.03 | 5.9±0.06ms | 1.00 | 5.7±0.09ms |
| physical_plan_tpch_q8 | 1.02 | 7.5±0.15ms | 1.00 | 7.4±0.08ms |
| physical_plan_tpch_q9 | 1.01 | 5.6±0.07ms | 1.00 | 5.5±0.11ms |
| physical_select_all_from_1000 | 1.01 | 61.5±0.34ms | 1.00 | 61.2±0.52ms |
| physical_select_one_from_700 | 1.02 | 3.7±0.05ms | 1.00 | 3.6±0.04ms |
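
For readers unfamiliar with the ratio figures above, the sketch below shows how they can be read, assuming each ratio is a branch's mean time divided by the fastest mean time in the same row (so the faster branch shows 1.00). The function name `relative_ratios` is made up for this example. Because the printed times are rounded, recomputing from them only approximates the printed ratios: for `physical_plan_tpch_q3`, 3.4 / 3.1 is about 1.10, while the table reports 1.11.

```rust
/// Divide each mean time by the fastest mean time in the row.
/// Hypothetical helper for illustration; all inputs must use the same unit.
fn relative_ratios(means: &[f64]) -> Vec<f64> {
    // The smallest mean is the baseline and maps to a ratio of 1.0.
    let best = means.iter().cloned().fold(f64::INFINITY, f64::min);
    means.iter().map(|m| m / best).collect()
}
```

For example, `relative_ratios(&[3.4, 3.1])` gives a ratio of 1.0 for the second entry and roughly 1.097 for the first.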