Remove redundant predicates after transitive closures

Open DevChattopadhyay opened this issue 2 years ago • 0 comments

Issue:

The current implementation of ORCA does not remove redundant predicates from the join condition after computing the transitive closure.

Solution:

This PR removes the redundant predicates in the following steps.

After the normalization step, the predicates have already been pushed down the tree, so they are redundant in the join condition.
If the child of the join is an EopScalarCmp, we do not check it for redundancy, because the hash join condition needs at least one child.
If the child of the join is an EopScalarBoolOp, we iterate over its children; for each child that is an EopScalarCmp of equality type, we check whether the value of that column is a constant.
If it is a constant, the child can be removed, because the predicate was already pushed down the tree in the earlier normalization step.
If all the children turn out to be redundant, we do not remove all of them, because the join would then degenerate into a nested loop join. To still get a hash join, we keep one child even though it is redundant, chosen based on whether its column is a distribution key.
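The steps above can be sketched roughly as follows. This is an illustrative Python model, not the ORCA C++ code: `EqCmp` and `prune_join_conjuncts` are hypothetical names, the constant columns and distribution keys are passed in as plain sets, and two points are assumptions not stated in this PR: a comparison counts as redundant only when both of its columns are known constants, and when no redundant child touches a distribution key the first child is kept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EqCmp:
    """Hypothetical stand-in for an EopScalarCmp of equality type."""
    left: str
    right: str


def prune_join_conjuncts(conjuncts, constant_cols, distribution_keys):
    """Return the join conjuncts that remain after dropping redundant ones.

    conjuncts: children of the join condition (a single comparison, or the
        children of an AND-type EopScalarBoolOp).
    constant_cols: columns already pinned to a constant by pushed-down filters.
    distribution_keys: columns that are distribution keys.
    """
    # A lone comparison is never checked: the hash join needs one condition.
    if len(conjuncts) <= 1:
        return list(conjuncts)

    def is_redundant(c):
        # Assumption: both sides must be constant for the child to be redundant.
        return (isinstance(c, EqCmp)
                and c.left in constant_cols
                and c.right in constant_cols)

    kept = [c for c in conjuncts if not is_redundant(c)]
    if kept:
        return kept

    # Every child is redundant: keep one so a hash join is still possible,
    # preferring a child whose column is a distribution key.
    for c in conjuncts:
        if c.left in distribution_keys or c.right in distribution_keys:
            return [c]
    return [conjuncts[0]]  # assumption: fall back to the first child


# Join condition from the Setup query: foo.a = bar.c AND foo.b = bar.d,
# with foo.b and bar.d both filtered to 'cc'.
cond = [EqCmp("foo.a", "bar.c"), EqCmp("foo.b", "bar.d")]
remaining = prune_join_conjuncts(cond, {"foo.b", "bar.d"}, {"foo.a", "bar.c"})
print(remaining)
```

With these inputs only the `foo.a = bar.c` comparison remains, which matches the Hash Cond in the new plan.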

Setup:

create table foo(a text, b text);
create table bar(c text, d text);
explain select * from foo join bar on foo.a=bar.c and foo.b=bar.d where bar.d='cc';

Existing Behaviour:

                            QUERY PLAN
-------------------------------------------------------------------------------
 Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..862.00 rows=1 width=32)
   ->  Hash Join  (cost=0.00..862.00 rows=1 width=32)
         Hash Cond: ((foo.a = bar.c) AND (foo.b = bar.d))
         ->  Seq Scan on foo  (cost=0.00..431.00 rows=1 width=16)
               Filter: (b = 'cc'::text)
         ->  Hash  (cost=431.00..431.00 rows=1 width=16)
               ->  Seq Scan on bar  (cost=0.00..431.00 rows=1 width=16)
                     Filter: (d = 'cc'::text)
 Optimizer: Pivotal Optimizer (GPORCA)
(9 rows)

New Behaviour:

                                  QUERY PLAN
-------------------------------------------------------------------------------
 Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..862.00 rows=1 width=32)
   ->  Hash Join  (cost=0.00..862.00 rows=1 width=32)
         Hash Cond: (foo.a = bar.c)
         ->  Seq Scan on foo  (cost=0.00..431.00 rows=1 width=16)
               Filter: (b = 'cc'::text)
         ->  Hash  (cost=431.00..431.00 rows=1 width=16)
               ->  Seq Scan on bar  (cost=0.00..431.00 rows=1 width=16)
                     Filter: (d = 'cc'::text)
 Optimizer: Pivotal Optimizer (GPORCA)
(9 rows)
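To see why dropping `foo.b = bar.d` from the Hash Cond does not change the result, here is a small self-contained check in Python. The rows are made up for illustration and are not taken from any test in this PR; the point is only that once both scans apply their `= 'cc'` filters, every surviving pair of rows already satisfies `foo.b = bar.d`.

```python
# Hypothetical sample rows: (a, b) for foo and (c, d) for bar.
foo = [("x", "cc"), ("y", "cc"), ("x", "dd")]
bar = [("x", "cc"), ("z", "cc"), ("x", "ee")]

# Filters pushed down to the scans, as in both plans.
foo_filtered = [row for row in foo if row[1] == "cc"]  # Filter: (b = 'cc')
bar_filtered = [row for row in bar if row[1] == "cc"]  # Filter: (d = 'cc')


def join(cond):
    """Naive join of the filtered inputs under the given condition."""
    return [(f, b) for f in foo_filtered for b in bar_filtered if cond(f, b)]


# Existing behaviour: (foo.a = bar.c) AND (foo.b = bar.d)
old_result = join(lambda f, b: f[0] == b[0] and f[1] == b[1])
# New behaviour: (foo.a = bar.c)
new_result = join(lambda f, b: f[0] == b[0])

assert old_result == new_result
print(new_result)
```

Both conditions return the same rows here because the second comparison is implied by the two filters.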

Here are some reminders before you submit the pull request

[ ] Add tests for the change
[ ] Document changes
[ ] Communicate in the mailing list if needed
[ ] Pass make installcheck
[ ] Review a PR in return to support the community

Sep 02 '22 08:09 DevChattopadhyay
