gporca
gporca copied to clipboard
Orca generates subplan for uncorrelated subquery
Repro:
create table foo(a int, b int);
create table bar(a int, b int);
explain select * from foo where a >= (select avg(b) from bar);
QUERY PLAN
--------------------------------------------------------------------------------------------------
Result (cost=0.00..1324033.45 rows=1 width=8)
Filter: foo.a::numeric >= ((SubPlan 1))
-> Gather Motion 3:1 (slice1; segments: 3) (cost=0.00..431.00 rows=1 width=8)
-> Table Scan on foo (cost=0.00..431.00 rows=1 width=8)
SubPlan 1
-> Aggregate (cost=0.00..431.00 rows=1 width=8)
-> Materialize (cost=0.00..431.00 rows=1 width=8)
-> Gather Motion 3:1 (slice2; segments: 3) (cost=0.00..431.00 rows=1 width=8)
-> Aggregate (cost=0.00..431.00 rows=1 width=8)
-> Table Scan on bar (cost=0.00..431.00 rows=34 width=4)
Settings: optimizer=on
Optimizer status: PQO version 2.54.1
(12 rows)
Planner generates initplan, which is better:
explain select * from foo where a >= (select avg(b) from bar);
QUERY PLAN
-------------------------------------------------------------------------------------------------
Gather Motion 3:1 (slice2; segments: 3) (cost=4.58..1396.09 rows=28700 width=8)
-> Seq Scan on foo (cost=4.58..1396.09 rows=9567 width=8)
Filter: a::numeric >= $0
InitPlan 1 (returns $0) (slice3)
-> Aggregate (cost=4.57..4.58 rows=1 width=32)
-> Gather Motion 3:1 (slice1; segments: 3) (cost=4.50..4.55 rows=1 width=32)
-> Aggregate (cost=4.50..4.51 rows=1 width=32)
-> Seq Scan on bar (cost=0.00..4.00 rows=34 width=4)
Settings: optimizer=off
Optimizer status: legacy query optimizer
(10 rows)
Orca should generates plan with either initplan or nestloop join.
AFAIK ORCA does not generate initplans. If we insert some data into these relations or increase the number of segments, then ORCA does pick a plan with nestloop join
Insert Data
insert into foo values(1), (2);
insert into bar select i from generate_series(1,1000)i;
analyze;
explain select * from foo where a >= (select avg(b) from bar);
QUERY PLAN
------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..1324042.42 rows=1 width=8)
Join Filter: foo.a::numeric >= (pg_catalog.avg((avg(bar.b))))
-> Aggregate (cost=0.00..431.01 rows=1 width=8)
-> Gather Motion 3:1 (slice2; segments: 3) (cost=0.00..431.01 rows=1 width=8)
-> Aggregate (cost=0.00..431.01 rows=1 width=8)
-> Table Scan on bar (cost=0.00..431.01 rows=334 width=4)
-> Materialize (cost=0.00..431.00 rows=1 width=8)
-> Gather Motion 3:1 (slice1; segments: 3) (cost=0.00..431.00 rows=2 width=8)
-> Table Scan on foo (cost=0.00..431.00 rows=1 width=8)
Settings: optimizer=on
Optimizer status: PQO version 2.54.2
(11 rows)
Increase number of segmnets
truncate foo, bar;
set optimizer_segments=40;
explain select * from foo where a >= (select avg(b) from bar);
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Gather Motion 3:1 (slice4; segments: 3) (cost=0.00..1324033.11 rows=1 width=8)
-> Nested Loop (cost=0.00..1324033.11 rows=1 width=8)
Join Filter: foo.a::numeric >= (pg_catalog.avg((avg(bar.b))))
-> Broadcast Motion 3:3 (slice3; segments: 3) (cost=0.00..431.00 rows=14 width=8)
-> Table Scan on foo (cost=0.00..431.00 rows=1 width=8)
-> Materialize (cost=0.00..431.00 rows=1 width=8)
-> Redistribute Motion 1:3 (slice2) (cost=0.00..431.00 rows=1 width=8)
-> Aggregate (cost=0.00..431.00 rows=1 width=8)
-> Gather Motion 3:1 (slice1; segments: 3) (cost=0.00..431.00 rows=1 width=8)
-> Aggregate (cost=0.00..431.00 rows=1 width=8)
-> Table Scan on bar (cost=0.00..431.00 rows=1 width=4)
Settings: optimizer=on; optimizer_segments=40
Optimizer status: PQO version 2.54.2
(13 rows)
But even with some data, it can still generate the subplan.