citus
citus copied to clipboard
"LIMIT subquery + intermediate results" gets a `WARNING` and returns wrong result.
citus version:
SELECT citus_version();
---Citus 13.2.0 on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
PG version: 17.6
query:
select
(subq_0.c_1) <> (subq_0.c_2) as c_0
from
((select
ref_0.c18 as c_0,
(select pg_catalog.variance(vkey) from t3)
as c_1,
ref_0.c16 as c_2
from
t2 as ref_0
where true::bool
order by c_0 desc, c_1 asc, c_2 desc) as subq_0
cross join t29 as ref_1
)
where (((null::bool) in (select
null::bool as c_0
from
t9 as ref_2
where null::bool
union
select
null::bool as c_0
from
t15 as ref_3
where false::bool))
or ((ref_1.c31) <= (subq_0.c_1)))
or (((select c32 from t29 order by c32 limit 1 offset 2)
) < (
select
ref_1.c31 as c_0
from
t9 as ref_4
where false::bool
order by c_0 desc
limit 1))
order by c_0 desc
will return 0 rows . But the same query returns 1404 rows in PG.
This query also returns a warning :
WARNING: Query could not find the intermediate result file "1_3", it was mostly likely deleted due to an error in a parallel process within the same distributed transaction.
query plan:
Custom Scan (Citus Adaptive) (cost=0.00..0.00 rows=0 width=0)
-> Distributed Subplan 1_1
-> Custom Scan (Citus Adaptive) (cost=0.00..0.00 rows=0 width=0)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=5432 dbname=testdb
-> Unique (cost=0.04..0.05 rows=2 width=1)
-> Sort (cost=0.04..0.05 rows=2 width=1)
Sort Key: (NULL::boolean)
-> Append (cost=0.00..0.03 rows=2 width=1)
-> Result (cost=0.00..0.01 rows=1 width=1)
One-Time Filter: (false AND NULL::boolean)
-> Result (cost=0.00..0.01 rows=1 width=1)
One-Time Filter: (false AND false)
-> Distributed Subplan 1_2
-> Custom Scan (Citus Adaptive) (cost=0.00..0.00 rows=0 width=0)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=5432 dbname=testdb
-> Limit (cost=2.54..2.54 rows=1 width=9)
-> Sort (cost=2.54..2.70 rows=67 width=9)
Sort Key: c32
-> Seq Scan on t29_102277 t29 (cost=0.00..1.67 rows=67 width=9)
-> Distributed Subplan 1_3
-> Aggregate (cost=1.72..1.73 rows=1 width=32)
-> Seq Scan on t3 (cost=0.00..1.57 rows=57 width=4)
-> Distributed Subplan 1_4
-> Sort (cost=11.56..11.60 rows=18 width=96)
Sort Key: ref_0.c18 DESC, ref_0.c16 DESC
InitPlan 1
-> Function Scan on read_intermediate_result intermediate_result (cost=0.00..10.00 rows=1000 width=32)
-> Seq Scan on t2 ref_0 (cost=0.00..1.18 rows=18 width=96)
Task Count: 1
Tasks Shown: All
-> Task
Node: host=localhost port=5432 dbname=testdb
-> Sort (cost=424827.34..424957.61 rows=52111 width=1)
Sort Key: ((intermediate_result.c_1 <> intermediate_result.c_2)) DESC
InitPlan 2
-> Function Scan on read_intermediate_result intermediate_result_2 (cost=0.00..10.00 rows=1000 width=32)
-> Nested Loop (cost=0.01..420734.62 rows=52111 width=1)
Join Filter: ((ANY NULL::boolean) OR (ref_1.c31 <= intermediate_result.c_1) OR ((InitPlan 2).col1 < (SubPlan 3)))
-> Function Scan on read_intermediate_result intermediate_result (cost=0.00..10.00 rows=1000 width=64)
-> Materialize (cost=0.00..2.00 rows=67 width=9)
-> Seq Scan on t29_102277 ref_1 (cost=0.00..1.67 rows=67 width=9)
SubPlan 1
-> Function Scan on read_intermediate_result intermediate_result_1 (cost=0.00..10.00 rows=1000 width=1)
SubPlan 3
-> Limit (cost=0.00..0.01 rows=1 width=32)
-> Result (cost=0.00..0.01 rows=1 width=32)
One-Time Filter: (false AND false)
JIT:
Functions: 19
Options: Inlining false, Optimization false, Expressions true, Deforming true
SCHEMA for repeoduce: