master_add_node fails to sync metadata if there is a DOMAIN defined within a schema other than public.

Open emelsimsek opened this issue 3 months ago • 1 comments

Citus 12.1.7/13.1.0 PG16/PG17

Repro:

  1. Create a cluster
  2. Run https://github.com/citusdata/citus/blob/b7bfe42f1a4d22db4b1ecc2636cdf83adf27c106/src/test/regress/sql/prepared_statements_create_load.sql
  3. Add a new node using master_add_node

Note that the command fails with an error while syncing the colocation metadata that references the type "prepared statements".test_key:

NOTICE:  issuing WITH colocation_group_data (colocationid, shardcount, replicationfactor, distributioncolumntype, distributioncolumncollationname, distributioncolumncollationschema)  AS (VALUES (4, 32, 1, '"prepared statements".test_key'::regtype, 'default', 'pg_catalog')) SELECT pg_catalog.citus_internal_add_colocation_metadata(colocationid, shardcount, replicationfactor, distributioncolumntype, coalesce(c.oid, 0)) FROM colocation_group_data d LEFT JOIN pg_collation c ON (d.distributioncolumncollationname = c.collname AND d.distributioncolumncollationschema::regnamespace = c.collnamespace)
DETAIL:  on server emel@localhost:9704 connectionId: 1
WARNING:  schema "prepared statements" does not exist
ERROR:  failure on connection marked as essential: localhost:9704

The command errors out because it is sent to the new node before the "prepared statements" schema has been created there.

emelsimsek Sep 15 '25 10:09

This seems to be broken even when the domain is in the public schema:

# coordinator - 9700

psql> drop table if exists dist; drop domain if exists positive_int; CREATE DOMAIN positive_int as int check ((VALUE) > 0); CREATE TABLE dist(a positive_int); SELECT create_distributed_table('dist', 'a');
DROP TABLE
Time: 81.094 ms
DROP DOMAIN
Time: 7.860 ms
CREATE DOMAIN
Time: 9.586 ms
CREATE TABLE
Time: 2.201 ms
┌──────────────────────────┐
│ create_distributed_table │
├──────────────────────────┤
│                          │
└──────────────────────────┘
(1 row)

psql> set citus.log_remote_commands to on;
SET

psql> SELECT 1 FROM citus_add_node('localhost', 9702);
...
...
NOTICE:  issuing WITH colocation_group_data (colocationid, shardcount, replicationfactor, distributioncolumntype, distributioncolumncollationname, distributioncolumncollationschema)  AS (VALUES (3, 32, 1, 'public.positive_int'::regtype, NULL, NULL)) SELECT citus_internal.add_colocation_metadata(colocationid, shardcount, replicationfactor, distributioncolumntype, coalesce(c.oid, 0)) FROM colocation_group_data d LEFT JOIN pg_collation c ON (d.distributioncolumncollationname = c.collname AND d.distributioncolumncollationschema::regnamespace = c.collnamespace)
DETAIL:  on server onurctirtir@localhost:9702 connectionId: 2
WARNING:  type "public.positive_int" does not exist
...
...

Until we fix this issue, the workaround is to create the domain on the new worker node before adding it:

# new node - 9702
psql> CREATE DOMAIN positive_int as int check ((VALUE) > 0);
CREATE DOMAIN
# coordinator - 9700
psql> SELECT 1 FROM citus_add_node('localhost', 9702);
┌──────────┐
│ ?column? │
├──────────┤
│        1 │
└──────────┘
(1 row)
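
To find which domains would have to be pre-created for this workaround, a catalog query along the following lines could be run on the coordinator. This is only a sketch using standard PostgreSQL catalogs: it lists every user-defined domain in the current database, not only those used as distribution columns, and it has not been verified against this bug. For a domain outside public, the schema presumably has to exist on the new node as well, given the `schema "prepared statements" does not exist` warning in the original report.

```sql
-- List user-defined domains with their base type and any CHECK constraints,
-- so they can be recreated by hand on the new node before citus_add_node.
-- Assumption: filtering on typtype = 'd' (domain) and skipping system schemas
-- is enough to find the relevant types; it may return more than is needed.
SELECT n.nspname                                            AS schema_name,
       t.typname                                            AS domain_name,
       pg_catalog.format_type(t.typbasetype, t.typtypmod)   AS base_type,
       pg_catalog.pg_get_constraintdef(c.oid)               AS constraint_def
FROM pg_catalog.pg_type t
JOIN pg_catalog.pg_namespace n ON n.oid = t.typnamespace
LEFT JOIN pg_catalog.pg_constraint c ON c.contypid = t.oid
WHERE t.typtype = 'd'
  AND n.nspname NOT IN ('pg_catalog', 'information_schema')
ORDER BY schema_name, domain_name;
```

A domain with several constraints appears once per constraint; a domain with none appears once with a NULL constraint_def.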

onurctirtir Nov 13 '25 12:11