datafusion Cache output `equivalence_properties` in ProjectionExec

Which issue does this PR close?

Closes #9084.

Rationale for this change

Avoid potential exponential explosions in branching calls to output_partitioning and equivalence_properties in certain plan combinations.

What changes are included in this PR?

Simply store the equivalence_properties, which are always calculated anyway in ProjectionExec.

Are these changes tested?

They were tested on the example from https://github.com/apache/arrow-datafusion/issues/9084

Are there any user-facing changes?

Not really, apart from not stalling TPC-DS q64.

Feb 01 '24 08:02 gruuya

Out of curiousity I ran the sql_planner benchmark on this PR compared to c843226d84f79fa7ce99576ae8605ecef1a28484 (main at the time of writing).

Even though the PR is not targetted at those queries, the improvements seem to be in the range of no-change to -3% :+1:

Results

     Running benches/sql_planner.rs (/home/svs/code/arrow-datafusion/target/release-nonlto/deps/sql_planner-89232c907fea5c25)
Gnuplot not found, using plotters backend
Benchmarking logical_select_one_from_700: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.7s, enable flat sampling, or reduce sample count to 60.
logical_select_one_from_700
                        time:   [1.3148 ms 1.3211 ms 1.3283 ms]
                        change: [-1.2182% -0.5419% +0.1004%] (p = 0.11 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe

physical_select_one_from_700
                        time:   [4.5290 ms 4.5467 ms 4.5678 ms]
                        change: [-0.7308% -0.2853% +0.2090%] (p = 0.27 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) high mild
  4 (4.00%) high severe

Benchmarking logical_trivial_join_low_numbered_columns: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.8s, enable flat sampling, or reduce sample count to 60.
logical_trivial_join_low_numbered_columns
                        time:   [1.3374 ms 1.3413 ms 1.3454 ms]
                        change: [-1.0865% -0.5004% +0.0116%] (p = 0.06 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  4 (4.00%) high mild
  1 (1.00%) high severe

Benchmarking logical_trivial_join_high_numbered_columns: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.9s, enable flat sampling, or reduce sample count to 50.
logical_trivial_join_high_numbered_columns
                        time:   [1.3660 ms 1.3684 ms 1.3712 ms]
                        change: [-1.5172% -1.0961% -0.6794%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

Benchmarking logical_aggregate_with_join: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.1s, enable flat sampling, or reduce sample count to 50.
logical_aggregate_with_join
                        time:   [1.7815 ms 1.7837 ms 1.7859 ms]
                        change: [-2.1258% -1.5940% -1.1366%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe

Benchmarking physical_plan_tpch_q1: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.9s, enable flat sampling, or reduce sample count to 60.
physical_plan_tpch_q1   time:   [1.3576 ms 1.3596 ms 1.3617 ms]
                        change: [-1.0661% -0.4282% +0.3850%] (p = 0.26 > 0.05)
                        No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
  2 (2.00%) low mild
  4 (4.00%) high mild
  6 (6.00%) high severe

Benchmarking physical_plan_tpch_q2: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.1s, enable flat sampling, or reduce sample count to 50.
physical_plan_tpch_q2   time:   [1.3940 ms 1.3958 ms 1.3975 ms]
                        change: [-1.1803% -0.8095% -0.4489%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) low severe
  2 (2.00%) high mild
  2 (2.00%) high severe

Benchmarking physical_plan_tpch_q3: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.3s, enable flat sampling, or reduce sample count to 60.
physical_plan_tpch_q3   time:   [1.2505 ms 1.2523 ms 1.2541 ms]
                        change: [-1.5359% -1.0619% -0.6131%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  2 (2.00%) high mild
  2 (2.00%) high severe

Benchmarking physical_plan_tpch_q4: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.4s, enable flat sampling, or reduce sample count to 60.
physical_plan_tpch_q4   time:   [1.2593 ms 1.2613 ms 1.2632 ms]
                        change: [-1.2966% -0.9442% -0.6166%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low severe
  8 (8.00%) high mild

Benchmarking physical_plan_tpch_q5: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.6s, enable flat sampling, or reduce sample count to 60.
physical_plan_tpch_q5   time:   [1.2976 ms 1.2992 ms 1.3009 ms]
                        change: [-1.2683% -0.8299% -0.3415%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  4 (4.00%) high severe

Benchmarking physical_plan_tpch_q6: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.7s, enable flat sampling, or reduce sample count to 60.
physical_plan_tpch_q6   time:   [1.1260 ms 1.1282 ms 1.1304 ms]
                        change: [-0.8453% -0.4227% -0.0237%] (p = 0.06 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

Benchmarking physical_plan_tpch_q7: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.6s, enable flat sampling, or reduce sample count to 50.
physical_plan_tpch_q7   time:   [1.4897 ms 1.4930 ms 1.4976 ms]
                        change: [-1.1966% -0.5377% +0.1408%] (p = 0.13 > 0.05)
                        No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
  1 (1.00%) low severe
  5 (5.00%) high mild
  6 (6.00%) high severe

Benchmarking physical_plan_tpch_q8: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.7s, enable flat sampling, or reduce sample count to 50.
physical_plan_tpch_q8   time:   [1.5160 ms 1.5185 ms 1.5217 ms]
                        change: [-0.7303% -0.3073% +0.2294%] (p = 0.22 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  4 (4.00%) high severe

Benchmarking physical_plan_tpch_q9: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.1s, enable flat sampling, or reduce sample count to 50.
physical_plan_tpch_q9   time:   [1.4044 ms 1.4062 ms 1.4081 ms]
                        change: [-2.2499% -1.5619% -0.9930%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 10 outliers among 100 measurements (10.00%)
  2 (2.00%) low mild
  6 (6.00%) high mild
  2 (2.00%) high severe

Benchmarking physical_plan_tpch_q10: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.8s, enable flat sampling, or reduce sample count to 60.
physical_plan_tpch_q10  time:   [1.3326 ms 1.3346 ms 1.3368 ms]
                        change: [-1.5355% -1.1162% -0.6136%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 16 outliers among 100 measurements (16.00%)
  2 (2.00%) low severe
  1 (1.00%) low mild
  7 (7.00%) high mild
  6 (6.00%) high severe

Benchmarking physical_plan_tpch_q11: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.5s, enable flat sampling, or reduce sample count to 60.
physical_plan_tpch_q11  time:   [1.2844 ms 1.2861 ms 1.2880 ms]
                        change: [-1.0891% -0.7848% -0.4596%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe

Benchmarking physical_plan_tpch_q12: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.6s, enable flat sampling, or reduce sample count to 60.
physical_plan_tpch_q12  time:   [1.2914 ms 1.2932 ms 1.2953 ms]
                        change: [-1.3540% -0.8637% -0.3279%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 12 outliers among 100 measurements (12.00%)
  2 (2.00%) low mild
  5 (5.00%) high mild
  5 (5.00%) high severe

Benchmarking physical_plan_tpch_q13: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.0s, enable flat sampling, or reduce sample count to 60.
physical_plan_tpch_q13  time:   [1.1891 ms 1.1914 ms 1.1937 ms]
                        change: [-1.2510% -0.8514% -0.3725%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) low mild
  2 (2.00%) high mild
  2 (2.00%) high severe

Benchmarking physical_plan_tpch_q14: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.2s, enable flat sampling, or reduce sample count to 60.
physical_plan_tpch_q14  time:   [1.2280 ms 1.2300 ms 1.2322 ms]
                        change: [-0.8543% -0.3613% +0.2262%] (p = 0.18 > 0.05)
                        No change in performance detected.
Found 11 outliers among 100 measurements (11.00%)
  1 (1.00%) low mild
  6 (6.00%) high mild
  4 (4.00%) high severe

Benchmarking physical_plan_tpch_q16: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.3s, enable flat sampling, or reduce sample count to 60.
physical_plan_tpch_q16  time:   [1.2386 ms 1.2406 ms 1.2427 ms]
                        change: [-1.6075% -1.0106% -0.4775%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe

Benchmarking physical_plan_tpch_q17: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.1s, enable flat sampling, or reduce sample count to 60.
physical_plan_tpch_q17  time:   [1.2031 ms 1.2054 ms 1.2078 ms]
                        change: [-1.6412% -1.2875% -0.9239%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe

Benchmarking physical_plan_tpch_q18: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.5s, enable flat sampling, or reduce sample count to 60.
physical_plan_tpch_q18  time:   [1.2832 ms 1.2851 ms 1.2871 ms]
                        change: [-1.2041% -0.8586% -0.5098%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

Benchmarking physical_plan_tpch_q19: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.9s, enable flat sampling, or reduce sample count to 60.
physical_plan_tpch_q19  time:   [1.3537 ms 1.3567 ms 1.3602 ms]
                        change: [-4.1553% -2.8145% -1.6060%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  4 (4.00%) high mild
  8 (8.00%) high severe

Benchmarking physical_plan_tpch_q20: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.6s, enable flat sampling, or reduce sample count to 60.
physical_plan_tpch_q20  time:   [1.2995 ms 1.3014 ms 1.3034 ms]
                        change: [-0.6861% -0.3842% -0.0702%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild

Benchmarking physical_plan_tpch_q21: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.7s, enable flat sampling, or reduce sample count to 50.
physical_plan_tpch_q21  time:   [1.5220 ms 1.5242 ms 1.5265 ms]
                        change: [-4.0173% -3.0381% -2.1649%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low severe
  1 (1.00%) low mild
  4 (4.00%) high mild
  1 (1.00%) high severe

Benchmarking physical_plan_tpch_q22: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.9s, enable flat sampling, or reduce sample count to 50.
physical_plan_tpch_q22  time:   [1.3675 ms 1.3692 ms 1.3711 ms]
                        change: [-1.9314% -1.3749% -0.8150%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) low severe
  2 (2.00%) low mild
  3 (3.00%) high mild
  4 (4.00%) high severe

physical_plan_tpch_all  time:   [27.848 ms 27.893 ms 27.941 ms]
                        change: [-0.7545% -0.5233% -0.2729%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

Feb 01 '24 13:02 simonvandel

Thanks!

I wonder if we should add a similar optimization for other nodes where computing this is non trivial (like HashJoinExec or HashAggregateExec or FilterExec)

Yup, this makes sense to me. In fact AggregateExec also always computes the eq_props when created and then throws it away (thus forcing re-computing it later on).

I could imagine even changing the signature of ExecutionPlan::equivalence_properties to return a reference and basically force pre-computing the results 🤔

This also sounds good. Arguably for other nodes this might be forcing an eager computation of something that might not be used in theory, but in practice I think these two methods always get invoked during the planning process, at least to propagate the info up the plan tree to something that does cache it now (might also need some profiling/investigating here too).

I can try and make these additional changes separately if other people think it could be a good idea. I also think breaking up DefaultPhysicalPlanner::create_initial_plans LogicalPlan::Join case would be beneficial, since I think that's what's causing the stack overflow for the tpcds_physical_q64 test (which would then also cover the present issue).

Feb 02 '24 07:02 gruuya

@alamb let's wait on merging this. We want to think a little about this and have ideas for a different general solution. We should have something next week.

Feb 02 '24 07:02 ozankabak

@alamb let's wait on merging this. We want to think a little about this and have ideas for a different general solution. We should have something next week.

This is fine with me if it is ok with @gruuya. Also given how small this PR is, if he needs the change sooner maybe we can consider merging it so it is available while the more general solution is underway. I don't feel strongly about this.

@ozankabak / @berkaysynnada is there a ticket somewhere that describes what you are doing ? It might help to make the plans more visible to others as well as I would like to point out we have a version of proejction pushdown in InfluxDB https://github.com/influxdata/influxdb/blob/main/iox_query/src/physical_optimizer/projection_pushdown.rs that might be worth a look as well

Feb 02 '24 10:02 alamb

@ozankabak / @berkaysynnada is there a ticket somewhere that describes what you are doing ?

We will open the PR to the upstream next week. I will also request API-level suggestions there. It's not expected to significantly alter our current plans, but it will be a solid step towards optimizing potential outcomes following other existing optimizations and future ones. Thanks for the IOx solution, I will examine it.

Feb 02 '24 10:02 berkaysynnada

@gruuya, let us know if it is very urgent for you. Then we can go ahead and merge for now and change again in a few days. Otherwise, we can collaborate next week on the more general approach we are working on.

@ozankabak / @berkaysynnada is there a ticket somewhere that describes what you are doing ?

We will open a ticket and a draft PR next week as an RFC for this

Feb 02 '24 10:02 ozankabak

We will open the PR to the upstream next week

Just to be clear, I was suggesting a ticket Before the PR. Some reasons a ticket prior to PR might be be beneficial are:

It may avoid potentially unecessary work like this PR as others would know what is coming
Could provide an area to receive feeback before you code a particular approach up
Would give people a high level idea of what is coming, and thus making reviews faster

Feb 02 '24 10:02 alamb

BTW, for the sake clarity, there are two steps of the work here:

Finding a general solution to expensive calls such as equivalence_properties, whose PR will come next week when @mustafasrepo comes back.
Improving the projection pushdown code (whose issue @berkaysynnada will open shortly and will create a PR for next week)

These two are loosely related but independent tasks

Feb 02 '24 11:02 ozankabak

@alamb, @ozankabak this is not that urgent for us (though it would certainly be nice to have a fix soon).

My impression from https://github.com/apache/arrow-datafusion/issues/9084#issuecomment-1921137878 was that the general solution considered would actually be complementary to the one in this PR (i.e. locally caching stuff in the optimizer as opposed to caching it at the node level), hence possibly not necessitating changing it later on.

Otherwise, we can wait for the more general solution, thanks.

Feb 02 '24 11:02 gruuya

Just to be clear, I was suggesting a ticket Before the PR. Some reasons a ticket prior to PR might be be beneficial are:

I thought it would be easier to discuss through code, but you are right in the importance of filling a ticket. I just mean that this PR has nothing to do with the projection optimizer I'm currently working on.

@alamb let's wait on merging this. We want to think a little about this and have ideas for a different general solution. We should have something next week.

"a different general solution" is not the projection optimizer 😁

I have opened the issue as you suggest: https://github.com/apache/arrow-datafusion/issues/9111.

Feb 02 '24 11:02 berkaysynnada

Converting to draft as we wait for additional potential improvements so it is clear this PR is not waiting on feedback

Feb 02 '24 21:02 alamb

Update: In a couple days, we will open a PR with a solution that takes the core idea of this and generalizes it in a neat way (we think). Stay tuned!

Feb 18 '24 21:02 ozankabak

Superseded by #9346

Feb 29 '24 08:02 gruuya

datafusion datafusion copied to clipboard

Cache output `equivalence_properties` in ProjectionExec

Which issue does this PR close?

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

datafusion
datafusion copied to clipboard