cloudberry icon indicating copy to clipboard operation
cloudberry copied to clipboard

[Bug] ORCA bugs for pruning volatile functions by mistake

Open my-ship-it opened this issue 7 months ago • 0 comments

Apache Cloudberry version

main

What happened

create table t1(v1 int, v2 int, v3 int);
insert into t1 values(generate_series(1, 10), generate_series(11, 20), generate_series(21, 30));
analyze t1;

create table t2(v1 int, v2 int, v3 int);
insert into t2 values(generate_series(0, 100), generate_series(100, 200), generate_series(200, 300));

postgres=# explain verbose select c11.v1 from (select v1, v2, random() v3 from t1) as c11 left join (select v1, v2, v3 from t1) as c22 on c11.v1=c22.v1 where c11.v1 < 5
;
                                    QUERY PLAN
----------------------------------------------------------------------------------
 Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..862.00 rows=5 width=4)
   Output: t1.v1
   ->  Hash Left Join  (cost=0.00..862.00 rows=2 width=4)
         Output: t1.v1
         Hash Cond: (t1.v1 = t1_1.v1)
         ->  Seq Scan on public.t1  (cost=0.00..431.00 rows=2 width=4)
               Output: t1.v1
               Filter: (t1.v1 < 5)
         ->  Hash  (cost=431.00..431.00 rows=2 width=4)
               Output: t1_1.v1
               ->  Seq Scan on public.t1 t1_1  (cost=0.00..431.00 rows=2 width=4)
                     Output: t1_1.v1
                     Filter: (t1_1.v1 < 5)
 Settings: optimizer = 'on'
 Optimizer: GPORCA
(15 rows)

What you think should happen instead

The plan produced by planner of PG which is correct:

postgres=# explain verbose select c11.v1 from (select v1, v2, random() v3 from t1) as c11 left join (select v1, v2, v3 from t1) as c22 on c11.v1=c22.v1 where c11.v1 < 5
;
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice1; segments: 3)  (cost=1.07..2.17 rows=3 width=4)
   Output: c11.v1
   ->  Hash Right Join  (cost=1.07..2.12 rows=1 width=4)
         Output: c11.v1
         Hash Cond: (t1.v1 = c11.v1)
         ->  Seq Scan on public.t1  (cost=0.00..1.03 rows=3 width=4)
               Output: t1.v1, t1.v2, t1.v3
         ->  Hash  (cost=1.06..1.06 rows=1 width=4)
               Output: c11.v1
               ->  Subquery Scan on c11  (cost=0.00..1.06 rows=1 width=4)
                     Output: c11.v1
                     ->  Seq Scan on public.t1 t1_1  (cost=0.00..1.04 rows=1 width=16)
                           Output: t1_1.v1, NULL::integer, random()
                           Filter: (t1_1.v1 < 5)
 Settings: optimizer = 'off'
 Optimizer: Postgres query optimizer
(16 rows)

How to reproduce

As above

Operating System

CentOs

Anything else

No response

Are you willing to submit PR?

  • [ ] Yes, I am willing to submit a PR!

Code of Conduct

my-ship-it avatar May 29 '25 05:05 my-ship-it