gporca icon indicating copy to clipboard operation
gporca copied to clipboard

Orca generates redundant motion for left outer join with multiple distribution keys

Open hsyuan opened this issue 7 years ago • 0 comments

Repro:

create table foo(a varchar, b int, c int) distributed by(a,b);
create table bar(a varchar, b int, c int) distributed by(a,b);
create table zoo(a varchar, b int, c int) with(appendonly=true) distributed by(a,b);

explain select * from foo left join bar on foo.a = bar.a and foo.b = bar.b
left join zoo on zoo.a=foo.a and zoo.b=bar.b;

 Gather Motion 3:1  (slice4; segments: 3)  (cost=0.00..1293.00 rows=3 width=48)
   ->  Hash Left Join  (cost=0.00..1293.00 rows=1 width=48)
         Hash Cond: foo.a::text = zoo.a::text AND bar.b = zoo.b
         ->  Hash Left Join  (cost=0.00..862.00 rows=1 width=32)
               Hash Cond: foo.a::text = bar.a::text AND foo.b = bar.b
               ->  Redistribute Motion 3:3  (slice1; segments: 3)  (cost=0.00..431.00 rows=1 width=16)
                     Hash Key: foo.a
                     ->  Table Scan on foo  (cost=0.00..431.00 rows=1 width=16)
               ->  Hash  (cost=431.00..431.00 rows=1 width=16)
                     ->  Redistribute Motion 3:3  (slice2; segments: 3)  (cost=0.00..431.00 rows=1 width=16)
                           Hash Key: bar.a::text
                           ->  Table Scan on bar  (cost=0.00..431.00 rows=1 width=16)
         ->  Hash  (cost=431.00..431.00 rows=1 width=16)
               ->  Redistribute Motion 3:3  (slice3; segments: 3)  (cost=0.00..431.00 rows=1 width=16)
                     Hash Key: zoo.a::text
                     ->  Table Scan on zoo  (cost=0.00..431.00 rows=1 width=16)
 Settings:  enable_nestloop=on; optimizer=on
 Optimizer status: PQO version 2.53.11
(18 rows)

But if we change the last condition from zoo.b=bar.b to zoo.b=foo.b (same effect), it generates the following plan without redistribute motion:

explain select * from foo left join bar on foo.a = bar.a and foo.b = bar.b
left join zoo on zoo.a=foo.a and zoo.b=foo.b;

 Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..1293.00 rows=3 width=48)
   ->  Hash Left Join  (cost=0.00..1293.00 rows=1 width=48)
         Hash Cond: foo.a::text = zoo.a::text AND foo.b = zoo.b
         ->  Hash Left Join  (cost=0.00..862.00 rows=1 width=32)
               Hash Cond: foo.a::text = bar.a::text AND foo.b = bar.b
               ->  Table Scan on foo  (cost=0.00..431.00 rows=1 width=16)
               ->  Hash  (cost=431.00..431.00 rows=1 width=16)
                     ->  Table Scan on bar  (cost=0.00..431.00 rows=1 width=16)
         ->  Hash  (cost=431.00..431.00 rows=1 width=16)
               ->  Table Scan on zoo  (cost=0.00..431.00 rows=1 width=16)
 Settings:  enable_nestloop=on; optimizer=on
 Optimizer status: PQO version 2.53.11
(12 rows)

hsyuan avatar Jan 19 '18 22:01 hsyuan