Deleting a CHI resource may leave debris from replicated tables in [Zoo]Keeper that requires later cleanup
If you delete a CHI cluster that has replicated tables, [Zoo]Keeper metadata is not cleaned up when the replica(s) are deleted. This leads to errors like the following if you then re-create the CHI resource and try to add tables again:
Received exception from server (version 23.7.4):
Code: 253. DB::Exception: Received from localhost:9000. DB::Exception: There was an error on [chi-demo2-s3-0-0:9000]: Code: 253. DB::Exception: Replica /clickhouse/s3/tables/0/default/test_local/replicas/chi-demo2-s3-0-0 already exists. (REPLICA_ALREADY_EXISTS) (version 23.7.4.5 (official build)). (REPLICA_ALREADY_EXISTS)
(query: CREATE TABLE IF NOT EXISTS test_local ON CLUSTER `{cluster}`
To reproduce this problem, follow the steps below.
- Create a ClickHouse CHI using kubectl apply -f with a proper connection to Keeper.
- Create at least one replicated table. (See the example DDL below.)
- Delete the CHI using kubectl delete chi/<name>.
Now repeat steps 1 and 2. Table creation fails because the replica metadata still exists in ZooKeeper.
The workaround is to remove the replica paths using SYSTEM DROP REPLICA, which is painful if there are many tables. Example:
SYSTEM DROP REPLICA 'chi-demo2-s3-0-0' FROM ZKPATH '/clickhouse/s3/tables/0/default/test_local'
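When many tables are affected, the statements can be generated instead of typed by hand. The following is a minimal sketch, not part of the operator: the function name gen_drop_replica is hypothetical, and it assumes every table lives under the same ZooKeeper prefix (here /clickhouse/s3/tables/0/default, taken from the error message above). It only prints statements; review them before running them.

```shell
#!/bin/sh
# Hypothetical helper: print one SYSTEM DROP REPLICA statement per table.
# Usage: gen_drop_replica REPLICA ZK_PREFIX TABLE...
# Assumes each table's ZooKeeper path is ZK_PREFIX/TABLE.
gen_drop_replica() {
    replica=$1
    zk_prefix=$2
    shift 2
    for table in "$@"; do
        printf "SYSTEM DROP REPLICA '%s' FROM ZKPATH '%s/%s';\n" \
            "$replica" "$zk_prefix" "$table"
    done
}

# Example using the replica and path from the error message above.
gen_drop_replica 'chi-demo2-s3-0-0' '/clickhouse/s3/tables/0/default' test_local
```

The output can be saved to a file and passed to clickhouse-client once it has been checked.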
Use the following DDL for step 2 above.
CREATE TABLE IF NOT EXISTS test_local ON CLUSTER `{cluster}`
(
`A` Int64,
`S` String,
`D` Date
)
ENGINE = ReplicatedMergeTree('/clickhouse/{cluster}/tables/{shard}/{database}/test_local', '{replica}')
PARTITION BY D ORDER BY A;
Note: this problem does not arise if you scale replicas down. In that case the operator properly deletes the replica tables, which ensures ZooKeeper cleanup.
This is weird: the operator deletes all replicated tables when deleting a CHI. There's a regression test for that.
It could be related to https://github.com/Altinity/clickhouse-operator/issues/1388: DROP TABLE may return quickly while ClickHouse keeps deleting data in the background. Adding SYNC would probably help here.
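For anyone who wants to try the SYNC idea by hand before deleting a CHI, here is a rough sketch. It is an assumption about a possible manual workaround, not what the operator actually runs; the function name gen_drop_sync is hypothetical. DROP TABLE ... SYNC asks ClickHouse to finish the drop before returning.

```shell
#!/bin/sh
# Hypothetical helper: print a synchronous DROP TABLE statement per table.
# Assumes the {cluster} macro is defined, as in the CREATE TABLE DDL above.
gen_drop_sync() {
    for table in "$@"; do
        printf 'DROP TABLE IF EXISTS %s ON CLUSTER `{cluster}` SYNC;\n' "$table"
    done
}

# Example for the table used in this issue.
gen_drop_sync test_local
```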
Fixed in 0.24.0
Released in https://github.com/Altinity/clickhouse-operator/releases/tag/release-0.23.7