cloudberry
cloudberry copied to clipboard
[Bug] ORCA cannot collect and use the table FK info
Apache Cloudberry version
No response
What happened
TPCDS 39 SQL
explain analyze with inv as
(select w_warehouse_name,w_warehouse_sk,i_item_sk,d_moy
,stdev,mean, case mean when 0 then null else stdev/mean end cov
from(select w_warehouse_name,w_warehouse_sk,i_item_sk,d_moy
,stddev_samp(inv_quantity_on_hand) stdev,avg(inv_quantity_on_hand) mean
from inventory
,item
,warehouse
,date_dim
where inv_item_sk = i_item_sk
and inv_warehouse_sk = w_warehouse_sk
and inv_date_sk = d_date_sk
and d_year =1999
group by w_warehouse_name,w_warehouse_sk,i_item_sk,d_moy) foo
where case mean when 0 then 0 else stdev/mean end > 1)
select inv1.w_warehouse_sk,inv1.i_item_sk,inv1.d_moy,inv1.mean, inv1.cov
,inv2.w_warehouse_sk,inv2.i_item_sk,inv2.d_moy,inv2.mean, inv2.cov
from inv inv1,inv inv2
where inv1.i_item_sk = inv2.i_item_sk
and inv1.w_warehouse_sk = inv2.w_warehouse_sk
and inv1.d_moy=4
and inv2.d_moy=4+1
order by inv1.w_warehouse_sk,inv1.i_item_sk,inv1.d_moy,inv1.mean,inv1.cov
,inv2.d_moy,inv2.mean, inv2.cov;
the result of EXPLAIN ANALYZE
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Gather Motion 4:1 (slice1; segments: 4) (cost=0.00..10611.03 rows=1 width=56) (actual time=271918.249..271920.951 rows=1642 loops=1)
Merge Key: share0_ref3.w_warehouse_sk, share0_ref3.i_item_sk, share0_ref3.d_moy, share0_ref3.mean, share0_ref3.cov, share0_ref2.d_moy, share0_ref2.mean, share0_ref2.cov
-> Sort (cost=0.00..10611.03 rows=1 width=56) (actual time=271908.661..271908.745 rows=438 loops=1)
Sort Key: share0_ref3.w_warehouse_sk, share0_ref3.i_item_sk, share0_ref3.d_moy, share0_ref3.mean, share0_ref3.cov, share0_ref2.d_moy, share0_ref2.mean, share0_ref2.cov
Sort Method: quicksort Memory: 342kB
-> Sequence (cost=0.00..10611.03 rows=1 width=56) (actual time=271899.415..271908.168 rows=438 loops=1)
-> Shared Scan (share slice:id 1:0) (cost=0.00..9749.03 rows=1 width=1) (actual time=271396.666..271399.381 rows=15195 loops=1)
-> Result (cost=0.00..9749.03 rows=1 width=36) (actual time=270462.927..271390.897 rows=15195 loops=1)
Filter: (CASE (avg(inventory.inv_quantity_on_hand)) WHEN '0'::numeric THEN '0'::numeric ELSE ((stddev_samp(inventory.inv_quantity_on_hand)) / (avg(inventory.inv_quantity_on_hand))) END > '1'::numeric)
-> GroupAggregate (cost=0.00..9749.03 rows=1 width=36) (actual time=270702.186..271702.301 rows=255457 loops=1)
Group Key: warehouse.w_warehouse_name, warehouse.w_warehouse_sk, item.i_item_sk, date_dim.d_moy
-> Sort (cost=0.00..9749.03 rows=1 width=24) (actual time=270702.115..270882.301 rows=1149680 loops=1)
Sort Key: warehouse.w_warehouse_name, warehouse.w_warehouse_sk, item.i_item_sk, date_dim.d_moy
Sort Method: quicksort Memory: 576720kB
-> Redistribute Motion 4:4 (slice2; segments: 4) (cost=0.00..9749.03 rows=1 width=24) (actual time=266582.489..268482.509 rows=1149680 loops=1)
Hash Key: warehouse.w_warehouse_name, warehouse.w_warehouse_sk, item.i_item_sk, date_dim.d_moy
-> Hash Join (cost=0.00..9749.03 rows=1 width=24) (actual time=266579.279..267906.337 rows=1155600 loops=1)
Hash Cond: (item.i_item_sk = inventory.inv_item_sk)
Extra Text: (seg3) Hash chain length 92.2 avg, 360 max, using 49791 of 1048576 buckets.
-> Seq Scan on item (cost=0.00..435.28 rows=25500 width=4) (actual time=13.391..63.298 rows=25692 loops=1)
-> Hash (cost=9309.07..9309.07 rows=1 width=24) (actual time=266564.791..266564.795 rows=4590000 loops=1)
Buckets: 1048576 Batches: 1 Memory Usage: 299550kB
-> Broadcast Motion 4:4 (slice3; segments: 4) (cost=0.00..9309.07 rows=1 width=24) (actual time=219859.695..264574.467 rows=4590000 loops=1)
-> Hash Join (cost=0.00..9309.07 rows=1 width=24) (actual time=224132.763..262579.814 rows=1530000 loops=1)
Hash Cond: (date_dim.d_date_sk = inventory.inv_date_sk)
Extra Text: (seg0) Initial batch 0:
-> Seq Scan on date_dim (cost=0.00..433.71 rows=27 width=8) (actual time=20.864..32.408 rows=17 loops=1)
Filter: ((d_year = 1999) AND ((d_moy = 4) OR (d_moy = 5)))
-> Hash (cost=8875.36..8875.36 rows=1 width=24) (actual time=219839.658..219839.662 rows=133110000 loops=1)
Buckets: 2097152 (originally 1048576) Batches: 32 (originally 1) Memory Usage: 507780kB
-> Broadcast Motion 4:4 (slice4; segments: 4) (cost=0.00..8875.36 rows=1 width=24)
.179315.333 rows=133110000 loops=1)
-> Hash Join (cost=0.00..8875.36 rows=1 width=24) (actual time=40.545..164455.
ops=1)
Hash Cond: (inventory.inv_warehouse_sk = warehouse.w_warehouse_sk)
Extra Text: (seg3) Hash chain length 1.0 avg, 1 max, using 10 of 2097152 buckets.
-> Seq Scan on inventory (cost=0.00..1346.13 rows=33277500 width=16) (ac
4895.814 rows=33282571 loops=1)
-> Hash (cost=431.00..431.00 rows=1 width=12) (actual time=0.057..0.061 rows=10 loops=1)
Buckets: 2097152 Batches: 1 Memory Usage: 16385kB
-> Broadcast Motion 4:4 (slice5; segments: 4) (cost=0.00..431.00
tual time=0.020..0.029 rows=10 loops=1)
-> Seq Scan on warehouse (cost=0.00..431.00 rows=1 width=12)
.6.642 rows=4 loops=1)
-> Hash Join (cost=0.00..862.00 rows=1 width=56) (actual time=12.980..21.686 rows=438 loops=1)
Hash Cond: ((share0_ref3.i_item_sk = share0_ref2.i_item_sk) AND (share0_ref3.w_warehouse_sk = share0_ref2.w_warehouse_sk))
Extra Text: (seg0) Hash chain length 1.0 avg, 2 max, using 8836 of 1048576 buckets.
-> Redistribute Motion 4:4 (slice6; segments: 4) (cost=0.00..431.00 rows=1 width=28) (actual time=0.015..3.750 rows=6197 loops=1)
Hash Key: share0_ref3.i_item_sk, share0_ref3.w_warehouse_sk
-> Result (cost=0.00..431.00 rows=1 width=28) (actual time=271401.081..271407.101 rows=6253 loops=1)
Filter: (share0_ref3.d_moy = 4)
-> Shared Scan (share slice:id 6:0) (cost=0.00..431.00 rows=1 width=28) (actual time=271401.067..271404.665 rows=15195 loops=1)
-> Hash (cost=431.00..431.00 rows=1 width=28) (actual time=498.891..498.894 rows=8956 loops=1)
Buckets: 1048576 Batches: 1 Memory Usage: 8756kB
-> Redistribute Motion 4:4 (slice7; segments: 4) (cost=0.00..431.00 rows=1 width=28) (actual time=0.035..495.240 rows=8956 loops=1)
Hash Key: share0_ref2.i_item_sk, share0_ref2.w_warehouse_sk
-> Result (cost=0.00..431.00 rows=1 width=28) (actual time=271381.034..271387.405 rows=8993 loops=1)
Filter: (share0_ref2.d_moy = 5)
-> Shared Scan (share slice:id 7:0) (cost=0.00..431.00 rows=1 width=28) (actual time=271395.678..271398.164 rows=15195 loops=1)
Planning Time: 230.080 ms
(slice0) Executor memory: 741K bytes.
(slice1) Executor memory: 126218K bytes avg x 4x(0) workers, 126360K bytes max (seg3). Work_mem: 68275K bytes max.
(slice2) Executor memory: 309091K bytes avg x 4x(0) workers, 309091K bytes max (seg0). Work_mem: 299550K bytes max.
* (slice3) Executor memory: 554838K bytes avg x 4x(0) workers, 554893K bytes max (seg3). Work_mem: 507780K bytes max, 8257772K bytes wanted.
(slice4) Executor memory: 16467K bytes avg x 4x(0) workers, 16467K bytes max (seg0). Work_mem: 16385K bytes max.
(slice5) Executor memory: 27K bytes avg x 4x(0) workers, 27K bytes max (seg0).
(slice6) Executor memory: 80K bytes avg x 4x(0) workers, 80K bytes max (seg0).
(slice7) Executor memory: 80K bytes avg x 4x(0) workers, 80K bytes max (seg0).
Memory used: 2097152kB
Memory wanted: 74321941kB
Optimizer: Pivotal Optimizer (GPORCA)
Execution Time: 272064.651 ms
It uses broadcast motion instead of hash redistributed motion for the join result of warehouse and inventory as it estimates that there are only 1 row for the result. For the join condition 'inventory.inv_warehouse_sk = warehouse.w_warehouse_sk' , orca doesn't know the foreign key info , so it gives out a wrong estimation.
What you think should happen instead
No response
How to reproduce
psql -U gpadmin -v ON_ERROR_STOP=1 -A -q -t -P pager=off -v EXPLAIN_ANALYZE=" " -f /home/gpadmin/workspace/TPC-DS-HashData/05_sql/139.dsbench.39.sql
Operating System
centos7
Anything else
No response
Are you willing to submit PR?
- [ ] Yes, I am willing to submit a PR!
Code of Conduct
- [X] I agree to follow this project's Code of Conduct.