clickhouse-backup

Allow option SYSTEM DROP REPLICA ... FROM ZKPATH ... during restore if replica already exists; currently it changes the path to default_replica_path, default_replica_name
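
For context, the manual step this option would automate looks roughly like the sketch below. The replica name and ZooKeeper path are placeholders loosely based on examples later in this thread, not a prescribed recipe.

# drop the stale replica metadata from ZooKeeper before restoring the table again;
# 'host-0' and the ZK path are hypothetical examples
clickhouse-client --query "SYSTEM DROP REPLICA 'host-0' FROM ZKPATH '/clickhouse/tables/transactions/0/default/transactions_local'"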

Open Slach opened this issue 6 months ago • 11 comments

Slach, Jun 06 '25 15:06

@Slach, is there a workaround for this?

chipzzz, Aug 06 '25 14:08

Actually, if you face this issue, it means something is wrong with your ZooKeeper path.

@chipzzz, please provide more context about what you are trying to do.

Slach, Aug 06 '25 14:08

@Slach, I've restored several times to the same cluster; it works great!

I now want to restore to a new ClickHouse cluster with a new name. My original plan was to keep the same ZooKeeper and keep the same ZooKeeper paths (i.e. overwriting the path even though the cluster has a new name). I'm running into the error where it won't restore a table because the replica already exists.

I just realized that I don't have {cluster} in my table paths as I originally thought; it's simply /clickhouse/tables/transactions/{shard}/default/transactions_local, because I have 3 shards with 1 replica each. So I don't think I actually need to overwrite the path (may be my mistake).

I'm using a mostly default backup config, which worked fine for same-cluster restores.

I think I just realized that I'm making the mistake of overwriting the ZK path even though I don't need to, as it doesn't even have {cluster} in it, so the path would be the same anyway.

I'm wondering if, instead of using the same existing ZooKeeper, I should just deploy a new ZK along with the new ClickHouse cluster and restore normally; that might be easier.

This threw me off in the config: I thought I had always had {cluster} in the path. I'm not sure if the path changed for me after a couple of restores, or if I never had {cluster} in the path at all. Anyway, with the defaults below set, restores to the same cluster used to work fine; the only difference now is that I'm trying to restore to a new cluster.

      default_replica_path: /clickhouse/tables/{cluster}/{shard}/{database}/{table}
      default_replica_name: '{replica}'
    general:
      remote_storage: s3
      max_file_size: 0
      backups_to_keep_local: 0
      backups_to_keep_remote: 5
      log_level: warning
      allow_empty_backups: true
      download_concurrency: 18
      upload_concurrency: 4
      upload_max_bytes_per_second: 0
      download_max_bytes_per_second: 0
      object_disk_server_side_copy_concurrency: 32
      allow_object_disk_streaming: false
      use_resumable_state: true
      restore_schema_on_cluster: ""
      upload_by_part: true
      download_by_part: true
      restore_database_mapping: {}
      restore_table_mapping: {}
      retries_on_failure: 3
      retries_pause: 5s
      watch_interval: 1h
      full_interval: 24h
      watch_backup_name_template: shard{shard}-{type}-{time:20060102150405}
      sharded_operation_mode: ""
      cpu_nice_priority: 15
      io_nice_priority: idle
      rbac_backup_always: true
      rbac_conflict_resolution: recreate
      retriesduration: 5s
      watchduration: 1h0m0s
      fullduration: 24h0m0s
    clickhouse:
      username: default
      password: ""
      host: localhost
      port: 9000
      disk_mapping: {}
      skip_tables:
          - system.*
          - INFORMATION_SCHEMA.*
          - information_schema.*
          - _temporary_and_external_tables.*
      skip_table_engines: []
      timeout: 30m
      freeze_by_part: false
      freeze_by_part_where: ""
      use_embedded_backup_restore: false
      embedded_backup_disk: ""
      backup_mutations: true
      restore_as_attach: false
      check_parts_columns: true
      secure: false
      skip_verify: false
      sync_replicated_tables: true
      log_sql_queries: true
      restart_command: exec:systemctl restart clickhouse-server
      ignore_not_exists_error_during_freeze: true
      check_replicas_before_attach: true
      default_replica_path: /clickhouse/tables/{cluster}/{shard}/{database}/{table}
      default_replica_name: '{replica}'
      tls_key: ""
      tls_cert: ""
      tls_ca: ""
      max_connections: 18
    s3:
      assume_role_arn: ""
      force_path_style: true
      path: backup/shard-0
      object_disk_path: ""
      disable_ssl: false
      compression_level: 1
      compression_format: tar
      sse: ""
      sse_kms_key_id: ""
      sse_customer_algorithm: ""
      sse_customer_key: ""
      sse_customer_key_md5: ""
      sse_kms_encryption_context: ""
      disable_cert_verification: false
      use_custom_storage_class: false
      storage_class: STANDARD
      custom_storage_class_map: {}
      concurrency: 19
      part_size: 0
      max_parts_count: 4000
      allow_multipart_download: false
      object_labels: {}
      request_payer: ""
      check_sum_algorithm: ""
      debug: false
    api:
      listen: 0.0.0.0:7171
      enable_metrics: true
      enable_pprof: false
      username: ""
      password: ""
      secure: false
      certificate_file: ""
      private_key_file: ""
      ca_cert_file: ""
      ca_key_file: ""
      create_integration_tables: true
      integration_tables_host: ""
      allow_parallel: false
      complete_resumable_after_restart: true
      watch_is_main_process: false
    custom:
      upload_command: ""
      download_command: ""
      list_command: ""
      delete_command: ""
      command_timeout: 4h
      commandtimeoutduration: 4h0m0s

My setup:

            <clickhouse_remote_servers>
                  <sentry-clickhouse>
                      <shard>
                          <internal_replication>true</internal_replication>
                          <replica>
                              <host>host-0.svc</host>
                              <port>9000</port>
                              <user>default</user>
                          </replica>
                      </shard>
                      <shard>
                          <internal_replication>true</internal_replication>
                          <replica>
                              <host>host-1.svc</host>
                              <port>9000</port>
                              <user>default</user>
                          </replica>
                      </shard>
                      <shard>
                          <internal_replication>true</internal_replication>
                          <replica>
                              <host>host-2.svc</host>
                              <port>9000</port>
                              <user>default</user>
                          </replica>
                      </shard>
                  </sentry-clickhouse>
              </clickhouse_remote_servers>

This config, along with the default macros below, worked fine when I was restoring to the same cluster:

              <macros>
                  <replica from_env="HOSTNAME"></replica>
                  <shard from_env="SHARD"></shard>
              </macros>

My attempt was to overwrite the ZK path, which I'm now realizing was perhaps not needed.

ClickHouse container command override:

'/bin/bash', '-c', 'export SHARD=${HOSTNAME##*-} && export REPLICA_NAME=$(echo $HOSTNAME | sed "s/-new-/-/") && /entrypoint.sh'
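
To illustrate what that override does, here is a hypothetical expansion; the hostname host-new-0 is an assumption for a pod in the new cluster, not taken from the thread.

# hypothetical pod hostname in the new cluster
HOSTNAME=host-new-0
export SHARD=${HOSTNAME##*-}                              # strips everything up to the last "-": SHARD=0
export REPLICA_NAME=$(echo $HOSTNAME | sed "s/-new-/-/")  # replaces "-new-" with "-": REPLICA_NAME=host-0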

Then the macros, keeping REPLICA_NAME the same as in the old cluster:

                <replica from_env="REPLICA_NAME"></replica>
                <shard from_env="SHARD"></shard>

When restoring to the new cluster I also set restore_schema_on_cluster: "host-new" (not sure if it's needed; I kept it empty when restoring to the same cluster).

chipzzz, Aug 06 '25 14:08

Yep, I just did it without overwriting my ZK paths and everything worked as expected. My confusion was that I thought I needed to overwrite my ZK paths to keep the same name as the source cluster. However, I hadn't realized that my paths are not cluster-name specific, i.e.

/clickhouse/tables/transactions/{shard}/default/transactions_local

No {cluster} in the path.

Again, I thought I did have {cluster} in my paths before. That may be my mistake, or perhaps the paths changed after a couple of restores to the same cluster.
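
One way to stop guessing, assuming you can run queries against the source cluster, is to ask ClickHouse which ZooKeeper paths it actually registered for each replicated table:

# list the real ZooKeeper path and replica name for every replicated table on this node
clickhouse-client --query "SELECT database, table, zookeeper_path, replica_name FROM system.replicas"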

chipzzz, Aug 06 '25 15:08

"My original plan was to keep the same zookeeper and keep the same zookeeper paths" is the wrong approach.

Different clusters need to have different <zookeeper><root>. See the details at https://clickhouse.com/docs/operations/server-configuration-parameters/settings#zookeeper

Slach, Aug 06 '25 16:08

actually "clusters" from <remote_servers> know nothing about replication only <internal_replication>true</internal_replication> which means when you try to insert into engine=Distirbuted then insert into destination underlying tables will executed only for one replica and if the underlying table is replicated then lag controlled with max_replica_delay_for_distributed_queries

Slach, Aug 06 '25 16:08

The {cluster} macro is defined by clickhouse-operator, but it looks like you are using some custom ClickHouse installation with a Helm chart?

Slach, Aug 06 '25 16:08

"I'm wondering if, instead of using the same existing ZooKeeper, I should just deploy a new ZK along with the new ClickHouse cluster and restore normally; that might be easier."

Yep, this is the safest way, but if you don't have the resources, you can override <zookeeper><root> via /etc/clickhouse-server/config.d/.
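
A minimal sketch of such an override, assuming a config.d drop-in; both the file name and the /clickhouse-new root value are placeholders, not something mandated by clickhouse-backup:

# write a drop-in that puts the new cluster under its own ZooKeeper root
cat > /etc/clickhouse-server/config.d/zookeeper-root.xml <<'EOF'
<clickhouse>
    <zookeeper>
        <root>/clickhouse-new</root>
    </zookeeper>
</clickhouse>
EOF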

Slach, Aug 06 '25 16:08

Yes, I'm using a Helm chart. My data is not replicated, but it is distributed through Distributed tables, so when data hits a Distributed table it gets spread evenly across my shards. I see the same happening (at least from what I gathered) when restoring data with clickhouse-backup restore: my backup contains both the _local and _dist tables, and the _dist tables contain metadata about which shard the data should go on, so restored data still hits the correct shard rather than all landing on one shard.

Currently I have 3 shards with 1 replica each, with reads and writes evenly distributed through the Distributed table interfaces. Ref: https://clickhouse.com/docs/architecture/horizontal-scaling#creating-a-table

chipzzz, Aug 07 '25 19:08

"My data is not replicated, but it is distributed through Distributed tables"

Usually this schema design makes no sense. In any case, the second Distributed table shall have an underlying destination table.

clickhouse-backup doesn't backup/restore the temporary .bin files inside Distributed tables. In your case, it will backup/restore only the data from the underlying destination table of the second Distributed table.

"the _dist tables contain metadata about which shard the data should go on"

Actually, it contains only the sharding key expression. The real destination shard is calculated dynamically, according to your <remote_servers> config section, on every INSERT query into the Distributed table. Then the temporary .bin files are INSERTed into the destination table as usual, like a client INSERT statement in the Native format.
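
To make that concrete, here is a hedged sketch of the kind of table pair being discussed. The columns and sharding key are hypothetical; only the transactions_local name and the sentry-clickhouse cluster come from this thread, and transactions_dist is an assumed name for the corresponding _dist table.

clickhouse-client --multiquery <<'EOF'
-- underlying destination table: this is what clickhouse-backup actually backs up and restores
CREATE TABLE default.transactions_local
(
    id UInt64,
    ts DateTime
)
ENGINE = MergeTree
ORDER BY ts;

-- Distributed table: keeps no data of its own beyond in-flight .bin files; it stores the
-- cluster name, target table and sharding key expression, and the destination shard is
-- computed from <remote_servers> on every INSERT
CREATE TABLE default.transactions_dist AS default.transactions_local
ENGINE = Distributed('sentry-clickhouse', 'default', 'transactions_local', cityHash64(id));
EOF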

Slach, Aug 08 '25 03:08

Thanks for clarifying. Yes, I want to get to at least 2 replicas. This is for Sentry, and that's their default setup. Upgrades and expansions aren't easy because Sentry's migration script doesn't work well when you're sharded and replicated.

seborys40, Aug 08 '25 17:08