[fix](enforcer) shuffle if has continuous project or filter on cte consumer
What problem does this PR solve?
Related PR: #21412
Problem Summary:
This pull request improves the handling of distribution properties (specifically "must shuffle") for PhysicalProject and PhysicalFilter nodes in the query planner, and adds comprehensive unit tests to ensure correctness. The main logic ensures that when certain child nodes require shuffling, the planner correctly adjusts the distribution requirements, especially in the presence of Project, Filter, and Limit nodes.
Key changes include:
Distribution Property Handling Enhancements:
- Added logic in `ChildrenPropertiesRegulator` to check if a child node under a `PhysicalProject` or `PhysicalFilter` requires a "must shuffle" distribution, and to adjust the children’s properties accordingly. This is done via the new `mustShuffleUnderProjectOrFilter` method.
- Included `PhysicalLimit` in the set of nodes that can trigger a shuffle requirement, by updating imports and logic.
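To make the idea concrete, here is a minimal, self-contained sketch of the kind of check described above. It is not the code from this PR: the `Node`, `Kind`, and `Distribution` types are hypothetical stand-ins for the planner's classes, and the body of `mustShuffleUnderProjectOrFilter` is an assumption about the rule (look through consecutive Project, Filter, and Limit nodes for a "must shuffle" requirement), written only to illustrate it.

```java
final class MustShuffleSketch {
    // Hypothetical node kinds; the real planner uses classes such as PhysicalProject.
    enum Kind { PROJECT, FILTER, LIMIT, CTE_CONSUMER, SCAN }

    // Hypothetical two-valued distribution requirement.
    enum Distribution { ANY, MUST_SHUFFLE }

    // Hypothetical single-child plan node.
    static final class Node {
        final Kind kind;
        final Distribution distribution;
        final Node child; // null for a leaf

        Node(Kind kind, Distribution distribution, Node child) {
            this.kind = kind;
            this.distribution = distribution;
            this.child = child;
        }
    }

    // Assumed rule: descend through consecutive Project/Filter/Limit nodes and
    // report whether a "must shuffle" requirement is found along the way.
    static boolean mustShuffleUnderProjectOrFilter(Node node) {
        for (Node current = node; current != null; current = current.child) {
            if (current.distribution == Distribution.MUST_SHUFFLE) {
                return true;
            }
            boolean passThrough = current.kind == Kind.PROJECT
                    || current.kind == Kind.FILTER
                    || current.kind == Kind.LIMIT;
            if (!passThrough) {
                return false; // any other node ends the chain
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Node consumer = new Node(Kind.CTE_CONSUMER, Distribution.MUST_SHUFFLE, null);
        Node twoProjects = new Node(Kind.PROJECT, Distribution.ANY,
                new Node(Kind.PROJECT, Distribution.ANY, consumer));
        Node overScan = new Node(Kind.FILTER, Distribution.ANY,
                new Node(Kind.SCAN, Distribution.ANY, null));
        System.out.println(mustShuffleUnderProjectOrFilter(twoProjects)); // true
        System.out.println(mustShuffleUnderProjectOrFilter(overScan));    // false
    }
}
```

The point of the loop is that a single Project or Filter directly above the CTE consumer is not the only shape that matters; a chain of them has to be looked through as well, which is the case named in the PR title.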
Testing Improvements:
- Added a new test class `ChildrenPropertiesRegulatorTest.java` with detailed unit tests for the handling of "must shuffle" properties under `Project`, `Filter`, and `Limit` nodes. These tests use mocks to simulate various plan trees and assert correct distribution specification propagation.
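The shape of such tests can be sketched without any test framework. The example below is an illustration only, not the contents of `ChildrenPropertiesRegulatorTest.java`: `PlanStub`, `stub`, `needsShuffle`, and `runCases` are hypothetical names, hand-written stubs play the role that a mocking library plays in the real tests, and the behavior under test is an assumed simplification of the planner rule.

```java
import java.util.ArrayList;
import java.util.List;

final class RegulatorTestSketch {
    // Hypothetical stand-in for a physical plan node; not a Doris interface.
    interface PlanStub {
        String name();          // e.g. "Project", "Filter", "Limit", "CteConsumer"
        boolean mustShuffle();  // whether this node carries a "must shuffle" requirement
        PlanStub child();       // null for a leaf
    }

    // Hand-written stub factory, used here in place of a mocking library.
    static PlanStub stub(String name, boolean mustShuffle, PlanStub child) {
        return new PlanStub() {
            @Override public String name() { return name; }
            @Override public boolean mustShuffle() { return mustShuffle; }
            @Override public PlanStub child() { return child; }
        };
    }

    // Assumed behavior under test: look through Project/Filter/Limit nodes.
    static boolean needsShuffle(PlanStub root) {
        for (PlanStub p = root; p != null; p = p.child()) {
            if (p.mustShuffle()) {
                return true;
            }
            String n = p.name();
            if (!(n.equals("Project") || n.equals("Filter") || n.equals("Limit"))) {
                return false;
            }
        }
        return false;
    }

    // Records the case name when the actual result differs from the expected one.
    static void check(List<String> failures, String caseName, boolean actual, boolean expected) {
        if (actual != expected) {
            failures.add(caseName);
        }
    }

    // Builds one small plan tree per case and returns the names of failed cases.
    static List<String> runCases() {
        PlanStub consumer = stub("CteConsumer", true, null);
        PlanStub scan = stub("Scan", false, null);
        List<String> failures = new ArrayList<>();
        check(failures, "project over consumer",
                needsShuffle(stub("Project", false, consumer)), true);
        check(failures, "filter over project over consumer",
                needsShuffle(stub("Filter", false, stub("Project", false, consumer))), true);
        check(failures, "limit over consumer",
                needsShuffle(stub("Limit", false, consumer)), true);
        check(failures, "project over scan",
                needsShuffle(stub("Project", false, scan)), false);
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("failed cases: " + runCases());
    }
}
```

Each case builds only the nodes it needs, which keeps the plan tree under test visible in the assertion itself.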
Regression Test Coverage:
- Added a new regression test in `cte.groovy` to verify correct behavior when multiple `Project` nodes are present on a CTE consumer, ensuring the planner handles such cases as expected.
These changes collectively make the planner more robust in handling complex plan trees with respect to distribution requirements, and ensure correctness through thorough testing.
Release note
None
Check List (For Author)
- Test
    - [x] Regression test
    - [x] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test. Explain why:
        - [ ] This is a refactor/code format and no logic has been changed.
        - [ ] Previous test can cover this change.
        - [ ] No code files have been changed.
        - [ ] Other reason
- Behavior changed:
    - [x] No.
    - [ ] Yes.
- Does this need documentation?
    - [x] No.
    - [ ] Yes.
Check List (For Reviewer who merge this PR)
- [ ] Confirm the release note
- [ ] Confirm test cases
- [ ] Confirm document
- [ ] Add branch pick label
run buildall
TPC-H: Total hot run time: 35797 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit d64255267ffa89a67bfb66f877741b71e52ad444, data reload: false
------ Round 1 ----------------------------------
q1 17588 4210 4116 4116
q2 2043 362 255 255
q3 10170 1369 746 746
q4 10239 933 327 327
q5 7537 2175 1950 1950
q6 193 178 140 140
q7 1023 870 719 719
q8 9395 1498 1149 1149
q9 6980 5389 5449 5389
q10 6799 2394 2017 2017
q11 518 343 302 302
q12 679 727 595 595
q13 17773 3687 3056 3056
q14 288 307 292 292
q15 596 512 527 512
q16 947 944 889 889
q17 723 876 470 470
q18 7723 7284 7043 7043
q19 1833 972 634 634
q20 406 372 255 255
q21 4244 3965 4021 3965
q22 1055 989 976 976
Total cold run time: 108752 ms
Total hot run time: 35797 ms
----- Round 2, with runtime_filter_mode=off -----
q1 4212 4110 4122 4110
q2 340 421 335 335
q3 2122 2697 2317 2317
q4 1383 1771 1313 1313
q5 4260 4217 4177 4177
q6 219 172 132 132
q7 1908 1842 1701 1701
q8 2568 2383 2393 2383
q9 7039 7034 6940 6940
q10 2961 3126 2713 2713
q11 588 507 480 480
q12 631 726 550 550
q13 3280 3677 3066 3066
q14 288 284 283 283
q15 554 511 536 511
q16 907 906 870 870
q17 1144 1466 1377 1377
q18 7373 7198 7002 7002
q19 861 850 851 850
q20 1891 1983 1888 1888
q21 4814 4423 4242 4242
q22 1100 1031 998 998
Total cold run time: 50443 ms
Total hot run time: 48238 ms
TPC-DS: Total hot run time: 182418 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit d64255267ffa89a67bfb66f877741b71e52ad444, data reload: false
query5 5751 643 473 473
query6 335 234 226 226
query7 4218 464 276 276
query8 293 256 261 256
query9 8784 2586 2605 2586
query10 552 381 341 341
query11 15513 15010 14622 14622
query12 179 117 116 116
query13 1255 518 406 406
query14 6654 3309 3009 3009
query14_1 2917 2972 2872 2872
query15 216 198 185 185
query16 877 488 460 460
query17 1171 735 626 626
query18 2728 451 358 358
query19 231 230 210 210
query20 127 116 112 112
query21 220 143 116 116
query22 3979 4108 3950 3950
query23 16513 16462 16082 16082
query23_1 16095 16137 15990 15990
query24 7312 1692 1228 1228
query24_1 1267 1231 1260 1231
query25 541 465 420 420
query26 1251 274 162 162
query27 2735 467 312 312
query28 4465 2182 2154 2154
query29 814 604 446 446
query30 329 246 217 217
query31 836 700 640 640
query32 78 75 69 69
query33 562 332 284 284
query34 906 929 548 548
query35 786 833 719 719
query36 865 926 841 841
query37 138 91 74 74
query38 3874 3962 3839 3839
query39 780 734 722 722
query39_1 699 726 699 699
query40 232 142 129 129
query41 69 66 64 64
query42 115 111 107 107
query43 446 450 418 418
query44 1330 767 756 756
query45 192 192 181 181
query46 880 988 629 629
query47 1657 1703 1604 1604
query48 325 335 259 259
query49 646 440 365 365
query50 664 305 224 224
query51 3847 3890 3825 3825
query52 109 116 100 100
query53 332 355 292 292
query54 305 261 264 261
query55 82 81 77 77
query56 306 318 307 307
query57 1137 1139 1092 1092
query58 282 260 261 260
query59 2365 2602 2363 2363
query60 331 311 302 302
query61 166 162 165 162
query62 693 689 657 657
query63 333 302 303 302
query64 4979 1383 1128 1128
query65 4058 3975 3948 3948
query66 1378 467 348 348
query67 15120 15100 15073 15073
query68 5798 1028 750 750
query69 501 363 319 319
query70 1082 999 985 985
query71 383 331 290 290
query72 6459 5025 5036 5025
query73 754 676 309 309
query74 8857 8852 8629 8629
query75 3577 3564 3194 3194
query76 3821 1178 733 733
query77 531 401 295 295
query78 9638 9953 8973 8973
query79 1043 873 618 618
query80 1240 671 556 556
query81 553 268 239 239
query82 408 134 105 105
query83 369 264 244 244
query84 262 125 96 96
query85 962 540 469 469
query86 445 295 307 295
query87 4077 4040 3990 3990
query88 3202 2292 2283 2283
query89 465 425 390 390
query90 1901 162 163 162
query91 179 173 142 142
query92 71 65 66 65
query93 1110 933 571 571
query94 538 313 258 258
query95 557 391 326 326
query96 595 489 209 209
query97 2614 2674 2545 2545
query98 213 198 197 197
query99 1317 1306 1250 1250
Total cold run time: 262887 ms
Total hot run time: 182418 ms
ClickBench: Total hot run time: 27.26 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit d64255267ffa89a67bfb66f877741b71e52ad444, data reload: false
query1 0.05 0.05 0.05
query2 0.11 0.05 0.05
query3 0.26 0.08 0.09
query4 1.61 0.11 0.11
query5 0.28 0.25 0.26
query6 1.17 0.65 0.63
query7 0.02 0.02 0.02
query8 0.06 0.04 0.04
query9 0.56 0.50 0.51
query10 0.58 0.57 0.55
query11 0.17 0.11 0.12
query12 0.15 0.12 0.12
query13 0.62 0.60 0.61
query14 0.99 0.98 1.00
query15 0.82 0.79 0.81
query16 0.41 0.38 0.38
query17 1.00 1.02 1.06
query18 0.23 0.22 0.21
query19 1.90 1.83 1.84
query20 0.02 0.02 0.02
query21 15.45 0.30 0.16
query22 4.77 0.06 0.05
query23 16.16 0.29 0.11
query24 1.41 0.24 0.50
query25 0.06 0.05 0.07
query26 0.14 0.14 0.14
query27 0.06 0.05 0.05
query28 3.71 1.23 1.03
query29 12.61 4.09 3.20
query30 0.28 0.15 0.13
query31 2.82 0.63 0.39
query32 3.23 0.56 0.45
query33 3.01 3.00 3.02
query34 16.82 5.24 4.54
query35 4.60 4.65 4.57
query36 0.66 0.51 0.49
query37 0.11 0.07 0.06
query38 0.08 0.04 0.03
query39 0.04 0.02 0.03
query40 0.17 0.14 0.14
query41 0.10 0.03 0.02
query42 0.04 0.03 0.02
query43 0.04 0.03 0.04
Total cold run time: 97.38 s
Total hot run time: 27.26 s
FE Regression Coverage Report
Increment line coverage 60.00% (12/20) :tada:
Increment coverage report
Complete coverage report
run buildall
TPC-H: Total hot run time: 36495 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f924bb4d88b688780456c2fa8f451ba9f7665b61, data reload: false
------ Round 1 ----------------------------------
q1 17648 4148 4059 4059
q2 1999 351 237 237
q3 10195 1283 716 716
q4 10225 855 311 311
q5 7571 2103 1954 1954
q6 187 170 136 136
q7 1028 865 731 731
q8 9352 1401 1131 1131
q9 7012 5349 5361 5349
q10 6833 2400 1971 1971
q11 529 325 304 304
q12 654 718 570 570
q13 17801 3705 3070 3070
q14 297 316 302 302
q15 614 536 534 534
q16 712 702 640 640
q17 697 863 492 492
q18 7572 7785 8086 7785
q19 1160 1035 644 644
q20 424 385 250 250
q21 4682 4408 4235 4235
q22 1186 1074 1111 1074
Total cold run time: 108378 ms
Total hot run time: 36495 ms
----- Round 2, with runtime_filter_mode=off -----
q1 4532 4246 4310 4246
q2 322 401 349 349
q3 2464 3104 2464 2464
q4 1363 1892 1462 1462
q5 4550 4499 4418 4418
q6 207 173 121 121
q7 2005 1884 1791 1791
q8 2662 2477 2463 2463
q9 7487 7728 7459 7459
q10 2907 3126 2636 2636
q11 565 495 478 478
q12 638 705 602 602
q13 3277 3623 3069 3069
q14 256 283 265 265
q15 535 497 500 497
q16 624 696 593 593
q17 1097 1361 1441 1361
q18 7327 7202 7107 7107
q19 814 770 803 770
q20 1892 1966 1859 1859
q21 4565 4290 4186 4186
q22 1117 1013 993 993
Total cold run time: 51206 ms
Total hot run time: 49189 ms
TPC-DS: Total hot run time: 178500 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit f924bb4d88b688780456c2fa8f451ba9f7665b61, data reload: false
query5 5217 621 455 455
query6 335 236 224 224
query7 4218 459 264 264
query8 331 255 228 228
query9 8773 2569 2575 2569
query10 564 382 326 326
query11 15369 14735 14727 14727
query12 181 118 118 118
query13 1265 547 380 380
query14 6336 3269 3022 3022
query14_1 2901 2915 2877 2877
query15 205 202 181 181
query16 872 504 461 461
query17 1111 692 589 589
query18 2702 438 345 345
query19 234 226 205 205
query20 120 115 108 108
query21 226 138 115 115
query22 3879 3897 3800 3800
query23 16600 16193 15945 15945
query23_1 16008 16065 16084 16065
query24 7326 1641 1249 1249
query24_1 1271 1249 1261 1249
query25 553 470 426 426
query26 1242 270 155 155
query27 2762 466 305 305
query28 4488 2136 2135 2135
query29 816 549 437 437
query30 350 237 214 214
query31 807 701 605 605
query32 76 72 70 70
query33 539 331 296 296
query34 895 913 546 546
query35 804 811 739 739
query36 866 920 829 829
query37 139 101 82 82
query38 2892 2882 2879 2879
query39 750 736 733 733
query39_1 715 696 697 696
query40 226 137 120 120
query41 68 65 63 63
query42 108 114 106 106
query43 440 446 402 402
query44 1327 746 751 746
query45 195 189 182 182
query46 878 990 619 619
query47 1659 1692 1609 1609
query48 312 325 257 257
query49 647 427 360 360
query50 666 287 227 227
query51 3835 3827 3816 3816
query52 109 111 103 103
query53 325 355 294 294
query54 304 266 260 260
query55 79 81 76 76
query56 291 315 307 307
query57 1126 1149 1074 1074
query58 275 266 256 256
query59 2305 2464 2414 2414
query60 332 333 317 317
query61 199 187 196 187
query62 705 684 635 635
query63 334 302 309 302
query64 5127 1453 1164 1164
query65 4025 3961 3958 3958
query66 1404 474 337 337
query67 15102 14879 14704 14704
query68 4691 1038 740 740
query69 501 367 324 324
query70 1092 983 991 983
query71 368 316 296 296
query72 6172 4932 4916 4916
query73 665 555 311 311
query74 8778 8781 8641 8641
query75 3171 3136 2802 2802
query76 3946 1145 759 759
query77 528 395 298 298
query78 9474 9770 8873 8873
query79 1018 875 615 615
query80 1063 650 558 558
query81 537 267 241 241
query82 405 139 105 105
query83 277 252 243 243
query84 262 121 102 102
query85 925 527 487 487
query86 339 283 279 279
query87 3101 3094 3016 3016
query88 3262 2299 2281 2281
query89 473 417 392 392
query90 1956 155 157 155
query91 175 177 154 154
query92 72 70 77 70
query93 1045 905 563 563
query94 543 313 273 273
query95 567 366 323 323
query96 584 470 204 204
query97 2273 2301 2221 2221
query98 206 200 195 195
query99 1264 1316 1195 1195
Total cold run time: 256541 ms
Total hot run time: 178500 ms
ClickBench: Total hot run time: 27.94 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit f924bb4d88b688780456c2fa8f451ba9f7665b61, data reload: false
query1 0.05 0.04 0.05
query2 0.09 0.05 0.05
query3 0.26 0.09 0.08
query4 1.61 0.11 0.12
query5 0.27 0.25 0.26
query6 1.15 0.64 0.63
query7 0.04 0.03 0.02
query8 0.05 0.04 0.04
query9 0.58 0.51 0.50
query10 0.55 0.54 0.57
query11 0.15 0.10 0.11
query12 0.16 0.11 0.12
query13 0.62 0.61 0.62
query14 1.00 0.99 0.98
query15 0.82 0.79 0.82
query16 0.41 0.40 0.42
query17 1.03 1.01 1.05
query18 0.24 0.21 0.21
query19 1.97 1.90 1.81
query20 0.02 0.01 0.02
query21 15.46 0.29 0.16
query22 4.91 0.05 0.05
query23 16.03 0.31 0.11
query24 0.95 0.67 0.68
query25 0.11 0.08 0.07
query26 0.15 0.15 0.14
query27 0.05 0.08 0.05
query28 4.68 1.20 1.02
query29 12.61 4.07 3.36
query30 0.28 0.14 0.13
query31 2.82 0.64 0.39
query32 3.24 0.56 0.47
query33 2.98 3.11 3.01
query34 17.09 5.17 4.60
query35 4.62 4.62 4.57
query36 0.69 0.50 0.49
query37 0.11 0.07 0.06
query38 0.07 0.04 0.04
query39 0.04 0.04 0.03
query40 0.17 0.14 0.13
query41 0.09 0.04 0.03
query42 0.05 0.03 0.03
query43 0.04 0.04 0.04
Total cold run time: 98.31 s
Total hot run time: 27.94 s
FE UT Coverage Report
Increment line coverage 78.26% (18/23) :tada:
Increment coverage report
Complete coverage report
FE Regression Coverage Report
Increment line coverage 60.87% (14/23) :tada:
Increment coverage report
Complete coverage report
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.