citus icon indicating copy to clipboard operation
citus copied to clipboard

unable to add node when distribution columns have different collation types

Open mtuncer opened this issue 1 year ago • 2 comments

Description

  • two tables were created in a single node cluster, with text type but different collations ( C versus default)
  • both tables were distributed using create distributed table call
  • noticed they were colocated
  • attempt to add a new node fails
  • alter table alter column command to set the collation is rejected by coordinator due to it is being distribution key

Expected

  • Either created_distributed_table should fail or citus_add_node should succeed

Detailed repro steps

citus=> select version();
-------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 14.13 (Ubuntu 14.13-1.pgdg20.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0, 64-bit 
(1 row) 

citus=> select citus_version();
----------------------------------------------------------------------------------------------------
 Citus 12.1.5 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0, 64-bit 
(1 row) 

citus=> create table t1 ( a text collate "C", b text collate "C"); 
CREATE TABLE 
citus=> \d t1 
Table "public.t1" 
Column | Type | Collation | Nullable | Default 
--------+------+-----------+----------+--------- 
 a      | text | C         |          | 
 b      | text | C         |          | 
citus=> create table t2( a text, b text); 
CREATE TABLE 
citus=> select create_distributed_table('t1', 'a'); 
create_distributed_table 
-------------------------- 
(1 row)
citus=> select create_distributed_table('t2', 'a');
create_distributed_table 
--------------------------
(1 row)
citus=>

citus=> select * from pg_dist_partition;     
 logicalrelid | partmethod |                                                          partkey                                                           | colocationid | repmodel | autoconverted   
--------------+------------+----------------------------------------------------------------------------------------------------------------------------+--------------+----------+---------------   
 t1           | h          | {VAR :varno 1 :varattno 1 :vartype 25 :vartypmod -1 :varcollid 950 :varlevelsup 0 :varnosyn 1 :varattnosyn 1 :location -1} |            1 | s        | f        
 t2           | h          | {VAR :varno 1 :varattno 1 :vartype 25 :vartypmod -1 :varcollid 100 :varlevelsup 0 :varnosyn 1 :varattnosyn 1 :location -1} |            1 | s        | f       
(2 rows)  

citus=> select * from pg_collation where oid = 100 or oid=950; 
 oid | collname | collnamespace | collowner | collprovider | collisdeterministic | collencoding | collcollate | collctype | collversion
-----+----------+---------------+-----------+--------------+---------------------+--------------+-------------+-----------+-------------
 100 | default  |            11 |        10 | d            | t                   |           -1 |             |           |
 950 | C        |            11 |        10 | c            | t                   |           -1 | C           | C         |
(2 rows)   

select citus_add_node('<redacted>', 5432);
NOTICE:  shards are still on the coordinator after adding the new node
HINT:  Use SELECT rebalance_table_shards(); to balance shards data between workers and coordinator or SELECT citus_drain_node('<redacted>',5432); to permanently move shards away from the coordinator.
WARNING:  cannot colocate tables t2 and t1 
DETAIL:  Distribution column collations don't match for t2 and t1.  
ERROR:  failure on connection marked as essential: <redacted>:5432

mtuncer avatar Sep 06 '24 09:09 mtuncer

Reproduced in a one node cluster. On a multi-node citus cluster, the second distributed table creation step fails with the following error.

 select create_distributed_table('t2', 'a');
ERROR:  cannot colocate tables t2 and t1
DETAIL:  Distribution column collations don't match for t2 and t1.
CONTEXT:  while executing command on localhost:9802

This happens because colocation fails when the distribution columns use different collations. Ideally, Citus should detect this upfront and avoid attempting to colocate when collations differ (treating them as incompatible, similar to different data types).

eaydingol avatar Sep 24 '25 13:09 eaydingol

Well, we seem to have multiple issues that can cause colocating two tables even though their distribution key collations don't match:

  1. We don't check for a collation match when creating a colocated distributed table via create_distributed_table() or when altering a distributed table in a way to colocate it with another distributed table via alter_distributed_table(), which corresponds to providing "colocate_with >= 'other_table'" for both functions.

  2. We don't care about the collation used for the specified distribution key when creating a distributed table via create_distributed_table() with "colocate_with => 'none'" or "colocate_with => 'default'". Instead, we assume the collation of the column is same as what get_typcollation(distributionColumnType) returns. For;

    • "colocate_with => 'none'", this results in creating a new colocation group with the default collation of type, although the first table in that colocation group now uses a different collation.
    • "colocate_with => 'default'" (default behavior), this results in putting the new distributed table into a colocation group that uses the default collation of type.

    Also, note that create_distributed_table_concurrently() doesn't cause such an issue because it respects the collation specified for the column (distributionColumn->varcollid).

Merging #8257 will be useful to fix both, however, we still need to have way to fix such colocation groups in the clusters created before we fix those issues. A solution to this could be to expand citus_finish_citus_upgrade() to implement the following:

  1. Go over pg_dist_partition / pg_dist_colocation.
  2. For each colocation group that contains some distributed tables whose distribution key collations don't match what was recorded for the colocation group in pg_dist_colocation, move such tables out of their current colocation groups using update_distributed_table_colocation().

If we want to follow that path, we need to make sure to;

  1. Avoid slowing citus_finish_citus_upgrade() down a lot.
  2. Avoid breaking distributed function / procedure call push-down for the tables moved out, if there were any colocated distributed functions / procedures. Seems update_distributed_table_colocation() doesn't do anything about colocated functions when moving a table out of its colocation group, but I'm also not super sure if that may necessarily break anything in practice.

Besides all of these, I'm also wondering if we really need to record the collation used for the distribution keys when colocating distributed tables with each other. In other words, what if we didn't care about collation of distribution keys? Would that really cause incorrect behavior, e.g., when planning a router query or such?

To better understand this, I need to go over the old PRs to better understand why we decided to keep track of collations as part of colocation groups and need to better understand the collations in Postgres in general.

update: See https://github.com/citusdata/citus/commit/be3285828f17c8841d649361edc2aa3dd094f0cb, seems we should really care about dist key collations when colocating distributed tables.

onurctirtir avatar Oct 22 '25 15:10 onurctirtir