gporca icon indicating copy to clipboard operation
gporca copied to clipboard

Orca generates subplan for uncorrelated subquery

Open hsyuan opened this issue 7 years ago • 2 comments

Repro:

create table foo(a int, b int);
create table bar(a int, b int);

explain select * from foo where a >= (select avg(b) from bar);
                                            QUERY PLAN
--------------------------------------------------------------------------------------------------
 Result  (cost=0.00..1324033.45 rows=1 width=8)
   Filter: foo.a::numeric >= ((SubPlan 1))
   ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..431.00 rows=1 width=8)
         ->  Table Scan on foo  (cost=0.00..431.00 rows=1 width=8)
   SubPlan 1
     ->  Aggregate  (cost=0.00..431.00 rows=1 width=8)
           ->  Materialize  (cost=0.00..431.00 rows=1 width=8)
                 ->  Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..431.00 rows=1 width=8)
                       ->  Aggregate  (cost=0.00..431.00 rows=1 width=8)
                             ->  Table Scan on bar  (cost=0.00..431.00 rows=34 width=4)
 Settings:  optimizer=on
 Optimizer status: PQO version 2.54.1
(12 rows)

Planner generates initplan, which is better:

explain select * from foo where a >= (select avg(b) from bar);
                                           QUERY PLAN
-------------------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice2; segments: 3)  (cost=4.58..1396.09 rows=28700 width=8)
   ->  Seq Scan on foo  (cost=4.58..1396.09 rows=9567 width=8)
         Filter: a::numeric >= $0
         InitPlan 1 (returns $0)  (slice3)
           ->  Aggregate  (cost=4.57..4.58 rows=1 width=32)
                 ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=4.50..4.55 rows=1 width=32)
                       ->  Aggregate  (cost=4.50..4.51 rows=1 width=32)
                             ->  Seq Scan on bar  (cost=0.00..4.00 rows=34 width=4)
 Settings:  optimizer=off
 Optimizer status: legacy query optimizer
(10 rows)

Orca should generates plan with either initplan or nestloop join.

hsyuan avatar Jan 30 '18 18:01 hsyuan

AFAIK ORCA does not generate initplans. If we insert some data into these relations or increase the number of segments, then ORCA does pick a plan with nestloop join

Insert Data

insert into foo values(1), (2);
insert into bar select i from generate_series(1,1000)i;
analyze;

explain select * from foo where a >= (select avg(b) from bar);
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..1324042.42 rows=1 width=8)
   Join Filter: foo.a::numeric >= (pg_catalog.avg((avg(bar.b))))
   ->  Aggregate  (cost=0.00..431.01 rows=1 width=8)
         ->  Gather Motion 3:1  (slice2; segments: 3)  (cost=0.00..431.01 rows=1 width=8)
               ->  Aggregate  (cost=0.00..431.01 rows=1 width=8)
                     ->  Table Scan on bar  (cost=0.00..431.01 rows=334 width=4)
   ->  Materialize  (cost=0.00..431.00 rows=1 width=8)
         ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..431.00 rows=2 width=8)
               ->  Table Scan on foo  (cost=0.00..431.00 rows=1 width=8)
 Settings:  optimizer=on
 Optimizer status: PQO version 2.54.2
(11 rows)

Increase number of segmnets

truncate foo, bar;
set optimizer_segments=40;

explain select * from foo where a >= (select avg(b) from bar);
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice4; segments: 3)  (cost=0.00..1324033.11 rows=1 width=8)
   ->  Nested Loop  (cost=0.00..1324033.11 rows=1 width=8)
         Join Filter: foo.a::numeric >= (pg_catalog.avg((avg(bar.b))))
         ->  Broadcast Motion 3:3  (slice3; segments: 3)  (cost=0.00..431.00 rows=14 width=8)
               ->  Table Scan on foo  (cost=0.00..431.00 rows=1 width=8)
         ->  Materialize  (cost=0.00..431.00 rows=1 width=8)
               ->  Redistribute Motion 1:3  (slice2)  (cost=0.00..431.00 rows=1 width=8)
                     ->  Aggregate  (cost=0.00..431.00 rows=1 width=8)
                           ->  Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..431.00 rows=1 width=8)
                                 ->  Aggregate  (cost=0.00..431.00 rows=1 width=8)
                                       ->  Table Scan on bar  (cost=0.00..431.00 rows=1 width=4)
 Settings:  optimizer=on; optimizer_segments=40
 Optimizer status: PQO version 2.54.2
(13 rows)

dhanashreek89 avatar Feb 02 '18 19:02 dhanashreek89

But even with some data, it can still generate the subplan.

hsyuan avatar Mar 08 '18 21:03 hsyuan