citus icon indicating copy to clipboard operation
citus copied to clipboard

INSERT ... SELECT ... between schema based sharding tables incorrectly without erroring out

Open buriedpot opened this issue 2 months ago • 4 comments

Environment

PG 16 + Citus 12.1.5

Prepare

CREATE SCHEMA my_test;
SELECT citus_schema_distribute('my_test');
SET search_path TO my_test;
CREATE TABLE t1(id bigserial PRIMARY KEY, name text);
CREATE TABLE t2(id bigserial PRIMARY KEY, name text);

Error

=> EXPLAIN INSERT INTO my_test.t1(name) SELECT name FROM (SELECT name, max(id) FROM my_test.t2 GROUP BY name) AS sub;
                                     QUERY PLAN                                      
-----------------------------------------------
 Custom Scan (Citus INSERT ... SELECT)  (cost=0.00..0.00 rows=0 width=0)
   INSERT/SELECT method: pull to coordinator
   ->  Custom Scan (Citus Adaptive)  (cost=0.00..250.00 rows=100000 width=40)
         Task Count: 0
         Tasks Shown: All
(5 rows)

Here, Task Count: 0 is not expected.

Analysis

The reason may be related to the following code. It doesn't handle schema based sharding tables and doesn't error out.

multi_physical_planner.c:2202

	forboth_ptr(prunedShardList, prunedRelationShardList,
				relationRestriction, relationRestrictionContext->relationRestrictionList)
	{
		Oid relationId = relationRestriction->relationId;

		CitusTableCacheEntry *cacheEntry = GetCitusTableCacheEntry(relationId);
		if (!HasDistributionKeyCacheEntry(cacheEntry))
		{
			continue;
		}

Can you review this issue

buriedpot avatar Oct 13 '25 03:10 buriedpot

Hey @buriedpot,

Why do you think this is not supposed to be supported? I think it's safe to support insert .. select between two tables within the same distributed schema.

onurctirtir avatar Oct 14 '25 07:10 onurctirtir

Hey @buriedpot,

Why do you think this is not supposed to be supported? I think it's safe to support insert .. select between two tables within the same distributed schema.

Hi @onurctirtir, My concern is not that it shouldn't be supported, but that the current plan shows Task Count: 0, which means no actual work is sent to workers. If both tables are distributed and co-located, Citus should generate tasks to execute the INSERT...SELECT on shards. If they are not co-located, it should fall back to pull-to-coordinator or error out. But Task Count: 0 suggests a planning bug — the query appears to succeed but does nothing. Could you check if data is actually inserted?

buriedpot avatar Oct 14 '25 07:10 buriedpot

Hi @onurctirtir , The following issues may be related to this one:

https://github.com/citusdata/citus/issues/7783 https://github.com/citusdata/citus/issues/7698

buriedpot avatar Oct 15 '25 08:10 buriedpot

This looks like a bug in INSERT .. SELECT planning, specifically for schema-sharded and reference tables when the SELECT has a GROUP BY. If GROUP BY is removed, the query is planned correctly:

EXPLAIN (verbose, costs off) INSERT INTO t1 (name) SELECT name FROM (select name from t2)  sub;
┌───────────────────────────────────────────────────────────────────────┐
│                              QUERY PLAN                               │
├───────────────────────────────────────────────────────────────────────┤
│ Custom Scan (Citus INSERT ... SELECT)                                 │
│   INSERT/SELECT method: pull to coordinator                           │
│   ->  Custom Scan (Citus Adaptive)                                    │
│         Output: nextval('t1_id_seq'::regclass), remote_scan.name      │
│         Task Count: 1                                                 │
│         Tasks Shown: All                                              │
│         ->  Task                                                      │
│               Query: SELECT name FROM my_test.t2_102527 t2 WHERE true │
│               Node: host=localhost port=9701 dbname=citus             │
│               ->  Seq Scan on my_test.t2_102527 t2                    │
│                     Output: name                                      │
│                   Query Identifier: -5705421328430345798              │
│ Query Identifier: 4524145314481174201                                 │
└───────────────────────────────────────────────────────────────────────┘

But with a GROUP BY name (or DISTINCT name) in the SELECT, no tasks are produced and no rows are inserted:

EXPLAIN (verbose, costs off) INSERT INTO t1 (name) SELECT name FROM (select name from t2 group by name)  sub;
┌──────────────────────────────────────────────────────────────────┐
│                            QUERY PLAN                            │
├──────────────────────────────────────────────────────────────────┤
│ Custom Scan (Citus INSERT ... SELECT)                            │
│   INSERT/SELECT method: pull to coordinator                      │
│   ->  Custom Scan (Citus Adaptive)                               │
│         Output: nextval('t1_id_seq'::regclass), remote_scan.name │
│         Task Count: 0                                            │
│         Tasks Shown: All                                         │
│ Query Identifier: 27168639322615334                              │
└──────────────────────────────────────────────────────────────────┘

The same problem is present if t1 and t2 are reference tables. So its not restricted to schema-based sharding, but appears related to tables that resolve to one shard.

colm-mchugh avatar Oct 15 '25 13:10 colm-mchugh