Derive Combined Hashed Spec For Outer Joins - Patch

Open THANATOSLAVA opened this issue 3 years ago • 0 comments

Issue: The community reported a regression: plans generated after commit fc662ea contain a redundant redistribute motion in inner joins.

Root cause: Nulls Colocation was set to false across the board when computing a matching hashed distribution spec, regardless of join type.

Solution: When matching a hashed distribution spec for an inner join, set Nulls Colocation to true; when matching one for an outer join, set Nulls Colocation to false. This reflects the Nulls Colocation property required for / delivered by the outer relation in hash join operations.

Implementation:

  • [CPhysicalHashJoin] -- Return Nulls Colocation in spec matching for inner joins, and Non Nulls Colocation for outer joins.
  • [CPhysicalLeftOuterHashJoin] -- Add TODO comment. Left outer join should be able to return a combined hash spec even when only one relation is hash distributed.
  • [minidump] -- Space size change only. Added user's example to verify inner join matches the outer relation's Nulls Colocation.
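To make the rule concrete, here is a minimal standalone sketch of the idea, not the actual ORCA code. `JoinKind`, `HashedSpec`, `MatchHashedSpec`, and `Satisfies` are hypothetical names invented for illustration, and the satisfaction check is an assumed simplification: a spec that colocates nulls is taken to satisfy either requirement, while one that does not colocate nulls only satisfies a requirement that does not ask for it.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for illustration only;
// these are NOT the real ORCA classes or signatures.
enum class JoinKind { Inner, LeftOuter };

struct HashedSpec {
    std::vector<std::string> columns;  // hash distribution keys
    bool nullsColocated;               // true: rows with NULL keys are colocated
};

// Build the hashed spec matched against the other side's join keys.
// Inner joins keep Nulls Colocation; outer joins drop it.
HashedSpec MatchHashedSpec(const std::vector<std::string>& joinKeys,
                           JoinKind kind) {
    const bool nullsColocated = (kind == JoinKind::Inner);
    return HashedSpec{joinKeys, nullsColocated};
}

// Assumed satisfaction rule: keys must match, and a delivered spec
// without Nulls Colocation cannot satisfy a requirement that asks for it.
bool Satisfies(const HashedSpec& delivered, const HashedSpec& required) {
    if (delivered.columns != required.columns) {
        return false;
    }
    return delivered.nullsColocated || !required.nullsColocated;
}
```

Under these assumptions, a spec matched for an inner join on `a` satisfies a requirement of "hashed on `a` with nulls colocated", so no extra redistribute motion would be needed. A spec built with Nulls Colocation forced to false, as in the blanket change, would fail the same check, which is one plausible way the redundant motion could arise.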

(cherry picked from commit 35f77dc0a5bd64aedc72fad6b861562b9c167bcd)

Here are some reminders before you submit the pull request

  • [ ] Add tests for the change
  • [ ] Document changes
  • [ ] Communicate in the mailing list if needed
  • [ ] Pass make installcheck
  • [ ] Review a PR in return to support the community

THANATOSLAVA, Aug 08 '22 22:08

This needs to go to master first. Correct?

vraghavan78, Aug 18 '22 09:08

@vraghavan78 Yes. PR for master: https://github.com/greenplum-db/gpdb/pull/13957

gpopt, Aug 18 '22 21:08