datafusion icon indicating copy to clipboard operation
datafusion copied to clipboard

Better CSE identifier

Open peter-toth opened this issue 1 year ago • 2 comments

This is a draft PR that implements the ideas from https://github.com/apache/datafusion/issues/10426#issuecomment-2105664520.

Which issue does this PR close?

Closes https://github.com/apache/datafusion/issues/10426.

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

peter-toth avatar May 12 '24 18:05 peter-toth

I think @peter-toth plans to break this PR up into smaller ones, so marking it as a draft to make it clear it isn't waiting on more feedback. If I am mistaken, please let me know

alamb avatar May 15 '24 19:05 alamb

I think @peter-toth plans to break this PR up into smaller ones, so marking it as a draft to make it clear it isn't waiting on more feedback. If I am mistaken, please let me know

Yes, here is the first part that adds the new TreeNode APIs: https://github.com/apache/datafusion/pull/10543

peter-toth avatar May 16 '24 10:05 peter-toth

@alamb, can you please help me with that MSRV failure? I don't know what could be the source of the failure and if I try running cargo msrv verify locally it fails with:

% cargo msrv verify
Fetching index
Unable to find key 'package.rust-version' (or 'package.metadata.msrv') in '/Users/ptoth/git/apache/arrow-datafusion/Cargo.toml'

peter-toth avatar Jun 21 '24 09:06 peter-toth

This PR is more or less ready for review, tests are passing except for the MSRV. I focused only on the 3 performance improvements and deliberately kept the code as close to the original as possible to ease review. I plan to open a follow-up PR to tackle remaining TODOs and refactor CSE code a bit for better readability.

Local benchmarks show good perfroamance improvements, but @alamb please confirm it with your standard setup.

% critcmp main better-cse-identifier
group                                         better-cse-identifier                   main
-----                                         ---------------------                   ----
logical_aggregate_with_join                   1.00    529.3±8.70µs        ? ?/sec     1.00    528.4±6.17µs        ? ?/sec
logical_plan_tpcds_all                        1.00     71.3±2.74ms        ? ?/sec     1.06    75.4±16.47ms        ? ?/sec
logical_plan_tpch_all                         1.00      7.3±0.19ms        ? ?/sec     1.09      7.9±2.34ms        ? ?/sec
logical_select_all_from_1000                  1.02     15.9±2.00ms        ? ?/sec     1.00     15.6±1.27ms        ? ?/sec
logical_select_one_from_700                   1.00    399.9±3.73µs        ? ?/sec     1.02    408.3±4.29µs        ? ?/sec
logical_trivial_join_high_numbered_columns    1.00   422.7±92.70µs        ? ?/sec     1.00   423.7±88.41µs        ? ?/sec
logical_trivial_join_low_numbered_columns     1.00    380.5±5.11µs        ? ?/sec     1.01    385.3±4.22µs        ? ?/sec
physical_plan_tpcds_all                       1.00   536.8±11.44ms        ? ?/sec     1.09   586.2±20.88ms        ? ?/sec
physical_plan_tpch_all                        1.01    38.5±12.88ms        ? ?/sec     1.00     38.1±0.59ms        ? ?/sec
physical_plan_tpch_q1                         1.00  1108.3±12.50µs        ? ?/sec     1.43  1585.0±12.90µs        ? ?/sec
physical_plan_tpch_q10                        1.00  1598.6±219.13µs        ? ?/sec    1.15  1842.3±280.13µs        ? ?/sec
physical_plan_tpch_q11                        1.00  1339.7±14.58µs        ? ?/sec     1.15  1540.2±247.46µs        ? ?/sec
physical_plan_tpch_q12                        1.00  1117.3±63.22µs        ? ?/sec     1.12  1252.1±17.87µs        ? ?/sec
physical_plan_tpch_q13                        1.00   769.6±19.54µs        ? ?/sec     1.12   864.0±12.96µs        ? ?/sec
physical_plan_tpch_q14                        1.00  930.6±216.63µs        ? ?/sec     1.16  1081.1±125.78µs        ? ?/sec
physical_plan_tpch_q16                        1.00  1331.4±38.36µs        ? ?/sec     1.15  1534.8±261.21µs        ? ?/sec
physical_plan_tpch_q17                        1.00  1209.8±29.88µs        ? ?/sec     1.09  1322.0±17.01µs        ? ?/sec
physical_plan_tpch_q18                        1.00  1532.7±240.18µs        ? ?/sec    1.04  1593.0±20.68µs        ? ?/sec
physical_plan_tpch_q19                        1.00      2.5±0.08ms        ? ?/sec     1.06      2.7±0.04ms        ? ?/sec
physical_plan_tpch_q2                         1.00      3.2±0.70ms        ? ?/sec     1.06      3.4±0.72ms        ? ?/sec
physical_plan_tpch_q20                        1.00  1642.3±16.61µs        ? ?/sec     1.12  1845.0±302.12µs        ? ?/sec
physical_plan_tpch_q21                        1.00      2.6±0.65ms        ? ?/sec     1.07      2.7±0.64ms        ? ?/sec
physical_plan_tpch_q22                        1.00  1196.6±101.79µs        ? ?/sec    1.08  1292.8±15.01µs        ? ?/sec
physical_plan_tpch_q3                         1.00  1105.8±28.61µs        ? ?/sec     1.23  1358.9±236.19µs        ? ?/sec
physical_plan_tpch_q4                         1.00   852.5±13.28µs        ? ?/sec     1.12  957.4±161.05µs        ? ?/sec
physical_plan_tpch_q5                         1.00  1685.6±23.24µs        ? ?/sec     1.10  1857.1±21.46µs        ? ?/sec
physical_plan_tpch_q6                         1.00  642.0±188.39µs        ? ?/sec     1.01   647.4±12.04µs        ? ?/sec
physical_plan_tpch_q7                         1.00      2.2±0.45ms        ? ?/sec     1.08      2.4±0.54ms        ? ?/sec
physical_plan_tpch_q8                         1.00      2.7±0.05ms        ? ?/sec     1.08      2.9±0.06ms        ? ?/sec
physical_plan_tpch_q9                         1.00  1964.3±55.93µs        ? ?/sec     1.08      2.1±0.04ms        ? ?/sec

peter-toth avatar Jun 21 '24 15:06 peter-toth

I am starting the benchmark run now -- I'll report back here and give this PR a look as soon as possible (but maybe not until tomorrow)

Thank you so much @peter-toth

alamb avatar Jun 21 '24 17:06 alamb

My results appear consistent with yours @peter-toth -- looks like maybe the high column count case is slightly worse -- I can maybe profile it to see if I can find any reason that might be

++ critcmp main better-cse-identifier
group                                         better-cse-identifier                  main
-----                                         ---------------------                  ----
logical_aggregate_with_join                   1.01  1023.5±41.82µs        ? ?/sec    1.00  1008.6±10.43µs        ? ?/sec
logical_plan_tpcds_all                        1.00    152.4±1.06ms        ? ?/sec    1.00    152.2±1.00ms        ? ?/sec
logical_plan_tpch_all                         1.00     16.9±0.19ms        ? ?/sec    1.01     17.1±0.26ms        ? ?/sec
logical_select_all_from_1000                  1.02     18.9±0.12ms        ? ?/sec    1.00     18.6±1.74ms        ? ?/sec
logical_select_one_from_700                   1.01    829.8±9.09µs        ? ?/sec    1.00   823.9±10.68µs        ? ?/sec
logical_trivial_join_high_numbered_columns    1.00   774.1±10.87µs        ? ?/sec    1.00   770.5±27.71µs        ? ?/sec
logical_trivial_join_low_numbered_columns     1.00   765.0±32.06µs        ? ?/sec    1.00    763.9±9.85µs        ? ?/sec
physical_plan_tpcds_all                       1.00  1073.0±12.72ms        ? ?/sec    1.08   1159.6±5.63ms        ? ?/sec
physical_plan_tpch_all                        1.00     71.7±0.67ms        ? ?/sec    1.10     79.0±0.88ms        ? ?/sec
physical_plan_tpch_q1                         1.00      2.5±0.04ms        ? ?/sec    1.36      3.5±0.02ms        ? ?/sec
physical_plan_tpch_q10                        1.00      3.6±0.03ms        ? ?/sec    1.12      4.0±0.03ms        ? ?/sec
physical_plan_tpch_q11                        1.00      3.1±0.03ms        ? ?/sec    1.09      3.4±0.03ms        ? ?/sec
physical_plan_tpch_q12                        1.00      2.4±0.01ms        ? ?/sec    1.12      2.7±0.02ms        ? ?/sec
physical_plan_tpch_q13                        1.00  1805.2±95.39µs        ? ?/sec    1.11      2.0±0.01ms        ? ?/sec
physical_plan_tpch_q14                        1.00      2.0±0.03ms        ? ?/sec    1.18      2.4±0.02ms        ? ?/sec
physical_plan_tpch_q16                        1.00      3.0±0.03ms        ? ?/sec    1.11      3.3±0.03ms        ? ?/sec
physical_plan_tpch_q17                        1.00      2.8±0.02ms        ? ?/sec    1.09      3.0±0.03ms        ? ?/sec
physical_plan_tpch_q18                        1.00      3.2±0.03ms        ? ?/sec    1.09      3.5±0.02ms        ? ?/sec
physical_plan_tpch_q19                        1.00      4.9±0.06ms        ? ?/sec    1.05      5.2±0.05ms        ? ?/sec
physical_plan_tpch_q2                         1.00      6.3±0.04ms        ? ?/sec    1.05      6.6±0.05ms        ? ?/sec
physical_plan_tpch_q20                        1.00      3.7±0.04ms        ? ?/sec    1.07      3.9±0.04ms        ? ?/sec
physical_plan_tpch_q21                        1.00      5.1±0.06ms        ? ?/sec    1.06      5.4±0.03ms        ? ?/sec
physical_plan_tpch_q22                        1.00      2.7±0.02ms        ? ?/sec    1.10      3.0±0.03ms        ? ?/sec
physical_plan_tpch_q3                         1.00      2.6±0.03ms        ? ?/sec    1.16      2.9±0.02ms        ? ?/sec
physical_plan_tpch_q4                         1.00  1956.8±10.71µs        ? ?/sec    1.09      2.1±0.02ms        ? ?/sec
physical_plan_tpch_q5                         1.00      3.7±0.04ms        ? ?/sec    1.09      4.1±0.02ms        ? ?/sec
physical_plan_tpch_q6                         1.00   1316.0±9.67µs        ? ?/sec    1.10  1452.2±16.94µs        ? ?/sec
physical_plan_tpch_q7                         1.00      4.7±0.03ms        ? ?/sec    1.04      4.8±0.03ms        ? ?/sec
physical_plan_tpch_q8                         1.00      5.8±0.04ms        ? ?/sec    1.07      6.2±0.04ms        ? ?/sec
physical_plan_tpch_q9                         1.00      4.4±0.03ms        ? ?/sec    1.07      4.7±0.03ms        ? ?/sec
physical_select_all_from_1000                 1.04     46.8±0.32ms        ? ?/sec    1.00     45.1±0.20ms        ? ?/sec
physical_select_one_from_700                  1.03      3.5±0.02ms        ? ?/sec    1.00      3.4±0.02ms        ? ?/sec

alamb avatar Jun 21 '24 18:06 alamb

@alamb, can you please help me with that MSRV failure? I don't know what could be the source of the failure and if I try running cargo msrv verify locally it fails with:

% cargo msrv verify
Fetching index
Unable to find key 'package.rust-version' (or 'package.metadata.msrv') in '/Users/ptoth/git/apache/arrow-datafusion/Cargo.toml'

I've bumped MSRV to 1.76 in https://github.com/apache/datafusion/pull/10473/commits/ccc92b9ab00813a69583abe2ec4ab5517107be35. It we want to avoid that then let me know and I will try to find out what is not available in 1.75.

peter-toth avatar Jun 22 '24 09:06 peter-toth

I ran the benchmarks again:

Looks to me like this PR makes planning 10% faster

physical_plan_tpcds_all                       1.00   1074.8±5.94ms        ? ?/sec    1.10   1181.1±6.54ms        ? ?/sec
physical_plan_tpch_all                        1.00     72.2±0.77ms        ? ?/sec    1.11     79.8±0.67ms        ? ?/sec

🚀

++ critcmp main better-cse-identifier
group                                         better-cse-identifier                  main
-----                                         ---------------------                  ----
logical_aggregate_with_join                   1.00  1038.1±44.37µs        ? ?/sec    1.00  1040.8±49.37µs        ? ?/sec
logical_plan_tpcds_all                        1.00    153.0±1.07ms        ? ?/sec    1.01    154.3±0.98ms        ? ?/sec
logical_plan_tpch_all                         1.00     17.1±0.20ms        ? ?/sec    1.01     17.2±0.20ms        ? ?/sec
logical_select_all_from_1000                  1.00     18.9±0.09ms        ? ?/sec    1.01     19.0±0.11ms        ? ?/sec
logical_select_one_from_700                   1.01   840.6±10.15µs        ? ?/sec    1.00    836.0±9.66µs        ? ?/sec
logical_trivial_join_high_numbered_columns    1.00   789.5±33.87µs        ? ?/sec    1.00   791.5±15.53µs        ? ?/sec
logical_trivial_join_low_numbered_columns     1.00   771.3±21.54µs        ? ?/sec    1.02   785.4±21.89µs        ? ?/sec
physical_plan_tpcds_all                       1.00   1074.8±5.94ms        ? ?/sec    1.10   1181.1±6.54ms        ? ?/sec
physical_plan_tpch_all                        1.00     72.2±0.77ms        ? ?/sec    1.11     79.8±0.67ms        ? ?/sec
physical_plan_tpch_q1                         1.00      2.6±0.02ms        ? ?/sec    1.38      3.5±0.03ms        ? ?/sec
physical_plan_tpch_q10                        1.00      3.6±0.03ms        ? ?/sec    1.14      4.1±0.06ms        ? ?/sec
physical_plan_tpch_q11                        1.00      3.1±0.03ms        ? ?/sec    1.11      3.4±0.04ms        ? ?/sec
physical_plan_tpch_q12                        1.00      2.4±0.02ms        ? ?/sec    1.13      2.8±0.02ms        ? ?/sec
physical_plan_tpch_q13                        1.00  1814.3±19.22µs        ? ?/sec    1.12      2.0±0.02ms        ? ?/sec
physical_plan_tpch_q14                        1.00      2.0±0.02ms        ? ?/sec    1.18      2.4±0.02ms        ? ?/sec
physical_plan_tpch_q16                        1.00      3.0±0.03ms        ? ?/sec    1.12      3.4±0.03ms        ? ?/sec
physical_plan_tpch_q17                        1.00      2.8±0.03ms        ? ?/sec    1.10      3.1±0.03ms        ? ?/sec
physical_plan_tpch_q18                        1.00      3.3±0.03ms        ? ?/sec    1.11      3.6±0.04ms        ? ?/sec
physical_plan_tpch_q19                        1.00      4.8±0.05ms        ? ?/sec    1.07      5.2±0.05ms        ? ?/sec
physical_plan_tpch_q2                         1.00      6.4±0.15ms        ? ?/sec    1.05      6.7±0.05ms        ? ?/sec
physical_plan_tpch_q20                        1.00      3.6±0.03ms        ? ?/sec    1.08      3.9±0.03ms        ? ?/sec
physical_plan_tpch_q21                        1.00      5.1±0.07ms        ? ?/sec    1.08      5.4±0.04ms        ? ?/sec
physical_plan_tpch_q22                        1.00      2.7±0.03ms        ? ?/sec    1.10      3.0±0.02ms        ? ?/sec
physical_plan_tpch_q3                         1.00      2.6±0.02ms        ? ?/sec    1.16      3.0±0.02ms        ? ?/sec
physical_plan_tpch_q4                         1.00  1974.3±23.01µs        ? ?/sec    1.09      2.1±0.03ms        ? ?/sec
physical_plan_tpch_q5                         1.00      3.7±0.04ms        ? ?/sec    1.10      4.1±0.06ms        ? ?/sec
physical_plan_tpch_q6                         1.00  1321.5±35.02µs        ? ?/sec    1.11  1472.4±20.45µs        ? ?/sec
physical_plan_tpch_q7                         1.00      4.7±0.04ms        ? ?/sec    1.05      4.9±0.05ms        ? ?/sec
physical_plan_tpch_q8                         1.00      5.9±0.08ms        ? ?/sec    1.09      6.4±0.06ms        ? ?/sec
physical_plan_tpch_q9                         1.00      4.4±0.05ms        ? ?/sec    1.08      4.8±0.04ms        ? ?/sec
physical_select_all_from_1000                 1.00     46.7±0.40ms        ? ?/sec    1.02     47.7±0.29ms        ? ?/sec
physical_select_one_from_700                  1.00      3.5±0.02ms        ? ?/sec    1.01      3.5±0.03ms        ? ?/sec

alamb avatar Jun 23 '24 12:06 alamb

🚀 thanks again @peter-toth

alamb avatar Jun 24 '24 20:06 alamb