citus icon indicating copy to clipboard operation
citus copied to clipboard

"LIMIT subquery + intermediate results" gets a `WARNING` and returns wrong result.

Open duerwuyi opened this issue 1 month ago • 0 comments

citus version:

SELECT citus_version();
---Citus 13.2.0 on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit

PG version: 17.6

query:

select  
  (subq_0.c_1) <> (subq_0.c_2) as c_0
from 
  ((select  
          ref_0.c18 as c_0, 
          (select pg_catalog.variance(vkey) from t3)
             as c_1, 
          ref_0.c16 as c_2
        from 
          t2 as ref_0
        where true::bool
        order by c_0 desc, c_1 asc, c_2 desc) as subq_0
    cross join t29 as ref_1
    )
where (((null::bool) in (select  
            null::bool as c_0
          from 
            t9 as ref_2
          where null::bool
        union
        select  
            null::bool as c_0
          from 
            t15 as ref_3
          where false::bool)) 
    or ((ref_1.c31) <= (subq_0.c_1))) 
  or (((select c32 from t29 order by c32 limit 1 offset 2)
      ) < ( 
    select  
        ref_1.c31 as c_0
      from 
        t9 as ref_4
      where false::bool
      order by c_0 desc
       limit 1))
order by c_0 desc

will return 0 rows . But the same query returns 1404 rows in PG. This query also returns a warning : WARNING: Query could not find the intermediate result file "1_3", it was mostly likely deleted due to an error in a parallel process within the same distributed transaction.

query plan:

Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=0 width=0)
  ->  Distributed Subplan 1_1
        ->  Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=0 width=0)
              Task Count: 1
              Tasks Shown: All
              ->  Task
                    Node: host=localhost port=5432 dbname=testdb
                    ->  Unique  (cost=0.04..0.05 rows=2 width=1)
                          ->  Sort  (cost=0.04..0.05 rows=2 width=1)
                                Sort Key: (NULL::boolean)
                                ->  Append  (cost=0.00..0.03 rows=2 width=1)
                                      ->  Result  (cost=0.00..0.01 rows=1 width=1)
                                            One-Time Filter: (false AND NULL::boolean)
                                      ->  Result  (cost=0.00..0.01 rows=1 width=1)
                                            One-Time Filter: (false AND false)
  ->  Distributed Subplan 1_2
        ->  Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=0 width=0)
              Task Count: 1
              Tasks Shown: All
              ->  Task
                    Node: host=localhost port=5432 dbname=testdb
                    ->  Limit  (cost=2.54..2.54 rows=1 width=9)
                          ->  Sort  (cost=2.54..2.70 rows=67 width=9)
                                Sort Key: c32
                                ->  Seq Scan on t29_102277 t29  (cost=0.00..1.67 rows=67 width=9)
  ->  Distributed Subplan 1_3
        ->  Aggregate  (cost=1.72..1.73 rows=1 width=32)
              ->  Seq Scan on t3  (cost=0.00..1.57 rows=57 width=4)
  ->  Distributed Subplan 1_4
        ->  Sort  (cost=11.56..11.60 rows=18 width=96)
              Sort Key: ref_0.c18 DESC, ref_0.c16 DESC
              InitPlan 1
                ->  Function Scan on read_intermediate_result intermediate_result  (cost=0.00..10.00 rows=1000 width=32)
              ->  Seq Scan on t2 ref_0  (cost=0.00..1.18 rows=18 width=96)
  Task Count: 1
  Tasks Shown: All
  ->  Task
        Node: host=localhost port=5432 dbname=testdb
        ->  Sort  (cost=424827.34..424957.61 rows=52111 width=1)
              Sort Key: ((intermediate_result.c_1 <> intermediate_result.c_2)) DESC
              InitPlan 2
                ->  Function Scan on read_intermediate_result intermediate_result_2  (cost=0.00..10.00 rows=1000 width=32)
              ->  Nested Loop  (cost=0.01..420734.62 rows=52111 width=1)
                    Join Filter: ((ANY NULL::boolean) OR (ref_1.c31 <= intermediate_result.c_1) OR ((InitPlan 2).col1 < (SubPlan 3)))
                    ->  Function Scan on read_intermediate_result intermediate_result  (cost=0.00..10.00 rows=1000 width=64)
                    ->  Materialize  (cost=0.00..2.00 rows=67 width=9)
                          ->  Seq Scan on t29_102277 ref_1  (cost=0.00..1.67 rows=67 width=9)
                    SubPlan 1
                      ->  Function Scan on read_intermediate_result intermediate_result_1  (cost=0.00..10.00 rows=1000 width=1)
                    SubPlan 3
                      ->  Limit  (cost=0.00..0.01 rows=1 width=32)
                            ->  Result  (cost=0.00..0.01 rows=1 width=32)
                                  One-Time Filter: (false AND false)
            JIT:
              Functions: 19
              Options: Inlining false, Optimization false, Expressions true, Deforming true

SCHEMA for repeoduce:

db_setup.sql

duerwuyi avatar Oct 31 '25 13:10 duerwuyi