jitsu icon indicating copy to clipboard operation
jitsu copied to clipboard

Alter distributed tables instead of recreation

Open amadrizwan opened this issue 3 years ago • 3 comments

Problem

When a new column is added, it is added in clickhouse real table using alter table, but for a clickhouse cluster, distributed tables are deleted and created again based on real tables. This creates an issue when dist_ tables are not available for the time they are being recreated. https://github.com/jitsucom/jitsu/blob/master/server/adapters/clickhouse.go line:409

Solution

We can use the same ALTER TABLE on CLUSTER for dist_ tables as well as we do with real tables rather than recreating them.

I tested it with clickhouse version 21.11 and it works fine

CREATE TABLE test ON CLUSTER cluster_1
(
_timestamp DateTime,
id UInt64,
event_id String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/{database}/{table}', '{replica}')
PARTITION BY toYYYYMM(_timestamp)
ORDER BY event_id;


CREATE TABLE IF NOT EXISTS  dist_test on cluster cluster_1 as test ENGINE = Distributed(cluster_1, default, test, rand());

 alter table test ON CLUSTER test MODIFY  COLUMN  id UInt16
 alter table dist_test  ON CLUSTER test MODIFY  COLUMN  id UInt16

amadrizwan avatar Feb 22 '22 12:02 amadrizwan

Hi Vladimir,

Did you get a change to review this?

amadrizwan avatar Mar 14 '22 08:03 amadrizwan

@amadrizwan Hi! The solution is in beta branch & Docker image. Please take a look.

jfk9w avatar Mar 14 '22 15:03 jfk9w

@jfk9w tested in a setup with already created tables. I started getting errors like this one.

022-03-16 16:09:41 [ERROR]: [clickhouse] Error storing table events from file /.../.../logs/events/incoming/incoming.tok=xxx.log: report.sql.patch_table: failed to patch table {db_info: database=xxx, cluster=xxx, table=events, statement=ALTER TABLE "xxx"."events"  ON CLUSTER "xxx"  }, cause: Code: 62, Message: Syntax error: failed at position 67 (end of query): . Expected one of: a list of ALTER commands, ALTER command, ADD COLUMN, RENAME COLUMN, MATERIALIZE COLUMN, DROP PARTITION, DROP PART, DROP DETACHED PARTITION, DROP DETACHED PART, DROP COLUMN, CLEAR COLUMN, ADD INDEX, DROP INDEX, CLEAR INDEX, MATERIALIZE INDEX, ADD PROJECTION, DROP PROJECTION, CLEAR PROJECTION, MATERIALIZE PROJECTION, MOVE PART, MOVE PARTITION, ADD CONSTRAINT, DROP CONSTRAINT, DETACH PARTITION, DETACH PART, ATTACH PARTITION, REPLACE PARTITION, ATTACH PART, FETCH PARTITION, FETCH PART, FREEZE, UNFREEZE, MODIFY COLUMN, ALTER COLUMN, MODIFY ORDER BY, MODIFY SAMPLE BY, REMOVE SAMPLE BY, DELETE, UPDATE, COMMENT COLUMN, MODIFY TTL, REMOVE TTL, MATERIALIZE TTL, MODIFY SETTING, RESET SETTING, MODIFY QUERY, MODIFY COMMENT

Seems like an incomplete ALTER TABLE command is generated.

amadrizwan avatar Mar 16 '22 16:03 amadrizwan