gporca
gporca copied to clipboard
Orca generates redundant motion for left outer join with multiple distribution keys
Repro:
create table foo(a varchar, b int, c int) distributed by(a,b);
create table bar(a varchar, b int, c int) distributed by(a,b);
create table zoo(a varchar, b int, c int) with(appendonly=true) distributed by(a,b);
explain select * from foo left join bar on foo.a = bar.a and foo.b = bar.b
left join zoo on zoo.a=foo.a and zoo.b=bar.b;
Gather Motion 3:1 (slice4; segments: 3) (cost=0.00..1293.00 rows=3 width=48)
-> Hash Left Join (cost=0.00..1293.00 rows=1 width=48)
Hash Cond: foo.a::text = zoo.a::text AND bar.b = zoo.b
-> Hash Left Join (cost=0.00..862.00 rows=1 width=32)
Hash Cond: foo.a::text = bar.a::text AND foo.b = bar.b
-> Redistribute Motion 3:3 (slice1; segments: 3) (cost=0.00..431.00 rows=1 width=16)
Hash Key: foo.a
-> Table Scan on foo (cost=0.00..431.00 rows=1 width=16)
-> Hash (cost=431.00..431.00 rows=1 width=16)
-> Redistribute Motion 3:3 (slice2; segments: 3) (cost=0.00..431.00 rows=1 width=16)
Hash Key: bar.a::text
-> Table Scan on bar (cost=0.00..431.00 rows=1 width=16)
-> Hash (cost=431.00..431.00 rows=1 width=16)
-> Redistribute Motion 3:3 (slice3; segments: 3) (cost=0.00..431.00 rows=1 width=16)
Hash Key: zoo.a::text
-> Table Scan on zoo (cost=0.00..431.00 rows=1 width=16)
Settings: enable_nestloop=on; optimizer=on
Optimizer status: PQO version 2.53.11
(18 rows)
But if we change the last condition from zoo.b=bar.b
to zoo.b=foo.b
(same effect), it generates the following plan without redistribute motion:
explain select * from foo left join bar on foo.a = bar.a and foo.b = bar.b
left join zoo on zoo.a=foo.a and zoo.b=foo.b;
Gather Motion 3:1 (slice1; segments: 3) (cost=0.00..1293.00 rows=3 width=48)
-> Hash Left Join (cost=0.00..1293.00 rows=1 width=48)
Hash Cond: foo.a::text = zoo.a::text AND foo.b = zoo.b
-> Hash Left Join (cost=0.00..862.00 rows=1 width=32)
Hash Cond: foo.a::text = bar.a::text AND foo.b = bar.b
-> Table Scan on foo (cost=0.00..431.00 rows=1 width=16)
-> Hash (cost=431.00..431.00 rows=1 width=16)
-> Table Scan on bar (cost=0.00..431.00 rows=1 width=16)
-> Hash (cost=431.00..431.00 rows=1 width=16)
-> Table Scan on zoo (cost=0.00..431.00 rows=1 width=16)
Settings: enable_nestloop=on; optimizer=on
Optimizer status: PQO version 2.53.11
(12 rows)