cloudberry
Modify the orca optimizer's processing of unionall distribution strat…
Change logs
The ORCA optimizer currently returns the ANY policy for the first child of a UNION ALL-like node. This produces a Gather Motion in the children below the node and a 1:n Redistribute Motion above it.
For example, before this change:
-> Redistribute Motion 1:3 (slice2)
-> Append (cost=0.00..863.91 rows=18001 width=12)
-> Finalize Vec Aggregate
-> Gather Motion 3:1 (slice3; segments: 3)
...
-> Gather Motion 3:1 (slice4; segments: 3)
-> HashAggregate
After this change:
-> Append (cost=0.00..863.91 rows=18001 width=12)
-> Result (cost=0.00..431.06 rows=1 width=12)
-> Redistribute Motion 1:3 (slice2)
-> Finalize Aggregate
-> Gather Motion 3:1 (slice3; segments: 3)
...
-> HashAggregate
When there are many nodes, the first plan becomes a performance bottleneck and needs to change. The gpdb community has already made the same fix in commit 0cd056a0a3d3c30a1d6d4479e67802b6673118c7.
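The difference between the two plans can be illustrated with a toy model of motion placement around an Append node. This is only a sketch, not ORCA code: the names `plan_union_all`, `AppendPlan`, `SINGLETON`, and `DISTRIBUTED` are hypothetical, and it assumes the fix amounts to asking the first child for a non-singleton distribution instead of accepting whatever it delivers, which the description above does not spell out.

```python
from dataclasses import dataclass

# Hypothetical labels for where a subtree's rows live.
SINGLETON = "singleton"      # all rows on one process
DISTRIBUTED = "distributed"  # rows spread across the segments


@dataclass
class AppendPlan:
    child_motions: list   # motion placed above each child ("" if none)
    top_motion: str       # motion placed above the Append ("" if none)
    append_runs_on: str   # distribution the Append itself executes with


def plan_union_all(child_dists, first_child_request, parent_needs=DISTRIBUTED):
    """Toy motion placement for a UNION ALL (Append) node.

    child_dists: distribution each child delivers on its own.
    first_child_request: "any" (assumed old behaviour) or
        "non_singleton" (assumed new behaviour).
    """
    if first_child_request == "any":
        # Accept whatever the first child delivers; siblings must match it.
        target = child_dists[0]
    else:
        # Insist that the Append runs distributed.
        target = DISTRIBUTED

    child_motions = []
    for dist in child_dists:
        if dist == target:
            child_motions.append("")
        elif target == SINGLETON:
            child_motions.append("Gather Motion n:1")
        else:
            child_motions.append("Redistribute Motion 1:n")

    top_motion = ""
    if target != parent_needs:
        if parent_needs == DISTRIBUTED:
            top_motion = "Redistribute Motion 1:n"
        else:
            top_motion = "Gather Motion n:1"
    return AppendPlan(child_motions, top_motion, target)


# First child: a finalized aggregate that is already gathered (singleton).
# Second child: a HashAggregate still spread across the segments.
children = [SINGLETON, DISTRIBUTED]
before = plan_union_all(children, "any")
after = plan_union_all(children, "non_singleton")
print(before)
print(after)
```

In this model the "any" request forces the second child through a Gather Motion and puts a single Redistribute Motion above the Append, while the "non_singleton" request moves the Redistribute Motion onto the first child and lets the Append run on every segment, which mirrors the shape of the two plans shown above.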
Why are the changes needed?
1. The current plan hurts performance.
2. The current plan is unreasonable.
Does this PR introduce any user-facing change?
Yes, the tpcds 167 query.
How was this patch tested?
yes.
Contributor's Checklist
Here are some reminders and checklists before/when submitting your pull request, please check them:
- [x] Make sure your Pull Request has a clear title and commit message. You can take git-commit template as a reference.
- [x] Sign the Contributor License Agreement as prompted for your first-time contribution (one-time setup).
- [x] Learn the coding contribution guide, including our code conventions, workflow and more.
- [x] List your communication in the GitHub Issues or Discussions (if has or needed).
- [x] Document changes.
- [x] Add tests for the change
- [x] Pass `make installcheck`
- [x] Pass `make -C src/test installcheck-cbdb-parallel`
- [x] Feel free to @cloudberrydb/dev team for review and approval when your PR is ready🥳