
Automatic recovery after complete data loss

Open · R-omk opened this issue 2 years ago

The algorithm for manual recovery has already been described here https://kb.altinity.com/altinity-kb-setup-and-maintenance/recovery-after-complete-data-loss/

Expected behavior

  • delete the PVC and the pod/StatefulSet (or otherwise remove all data from the volume)
  • after the new pod starts, the recovery process begins immediately

R-omk avatar Jul 05 '22 14:07 R-omk

At the moment I see that the operator creates databases (Atomic engine too) and even Distributed tables, but not ReplicatedMergeTree tables:

<Error> executeQuery: Code: 253. DB::Exception: Replica /clickhouse/tables/8b70dd70-8114-4237-8bdf-120fceb06ed0/shard0/replicas/chi-XXXX-s0r0 already exists. (REPLICA_IS_ALREADY_EXIST) (version 22.3.6.5 (official build))

Also, the operator does not restore the StatefulSet: https://github.com/Altinity/clickhouse-operator/issues/970

R-omk avatar Jul 05 '22 15:07 R-omk
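
For reference, one possible manual workaround for REPLICA_IS_ALREADY_EXIST, as a hedged sketch (the replica name comes from the error above; the database and table names are illustrative and must match your own installation, so verify against the KB article linked earlier):

```sql
-- Run on a surviving replica (the command refuses to drop the local replica).
-- This removes the lost replica's metadata from (Zoo)Keeper so the table can
-- be recreated on the wiped node.
SYSTEM DROP REPLICA 'chi-XXXX-s0r0' FROM TABLE mydb.mytable;
```

Once the stale entry is gone, recreating the table on the recovered node registers a fresh replica, which then fetches its parts from the healthy replicas.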

related https://github.com/Altinity/clickhouse-operator/issues/857

R-omk avatar Jul 05 '22 16:07 R-omk

Seeing the same issue on my end. The database is created on the new instance but the table (ReplicatedMergeTree) is not, and the logs contain the same DB::Exception: REPLICA_IS_ALREADY_EXIST.

mlucic avatar Jul 12 '22 20:07 mlucic

  1. I created a ClickHouseInstallation with 1 shard and 3 replicas, pointing to 3 ClickHouse Keeper nodes
  2. I created the database and table as follows:

```sql
CREATE DATABASE testdb ON CLUSTER '{cluster}';

CREATE TABLE IF NOT EXISTS testdb.testtable ON CLUSTER '{cluster}'
(
    id UUID,
    timestamp DateTime
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/{database}/testtable', '{replica}')
PARTITION BY toYYYYMM(timestamp)
ORDER BY (id, timestamp);
```

  3. I reduced the replicas on the ClickHouseInstallation to 2
  4. After the third replica was removed, I increased the replicas back to 3

I expected the new third replica to create the database and table successfully; however, only the database was created. The table failed to create due to the REPLICA_IS_ALREADY_EXIST error.

Here is the complete log from the new third replica: chi-di-clickhouse-installation-replicated-0-2-0.log

mlucic avatar Jul 13 '22 19:07 mlucic
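
One way to confirm the leftover replica registration behind this error, as a hedged sketch (the exact Keeper path depends on the value of the {shard} macro in your installation):

```sql
-- List the replicas still registered in Keeper for the test table; the removed
-- replica's name showing up here explains the REPLICA_IS_ALREADY_EXIST error.
SELECT name
FROM system.zookeeper
WHERE path = '/clickhouse/tables/0/testdb/testtable/replicas';
```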

@mlucic which clickhouse-operator version do you use?

Slach avatar Jul 14 '22 06:07 Slach

@mlucic which clickhouse-operator version do you use?

0.18.5

mlucic avatar Jul 14 '22 14:07 mlucic

REPLICA_IS_ALREADY_EXIST

operator version 0.19.0

R-omk avatar Jul 15 '22 12:07 R-omk

REPLICA_IS_ALREADY_EXIST

operator version 0.19.0

I just upgraded to 0.19.0, can confirm that this issue is still present

mlucic avatar Jul 15 '22 17:07 mlucic

@R-omk, restore process after losing a PV or PVC is fully implemented in 0.23.x

alex-zaitsev avatar Feb 14 '24 16:02 alex-zaitsev

@R-omk, restore process after losing a PV or PVC is fully implemented in 0.23.x

operator 0.23.3:
  • cannot detect that the StatefulSet was removed and recreate it
  • cannot detect a wrong StatefulSet scale (zero) and restore the scale

R-omk avatar Mar 19 '24 14:03 R-omk

The conditions under which replica data is restored are completely opaque. I believe the operator should check whether recovery is needed every time a pod belonging to the cluster is created.

R-omk avatar Mar 21 '24 13:03 R-omk
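
A hedged sketch of the kind of check being asked for here, expressed as SQL that could be run against a freshly started pod (column names are from system.replicas; the delay threshold is an arbitrary example):

```sql
-- Replicated tables whose local replica is read-only or badly lagging are
-- candidates for recovery (e.g. SYSTEM RESTORE REPLICA or re-creation).
SELECT database, table, is_readonly, is_session_expired, absolute_delay
FROM system.replicas
WHERE is_readonly OR absolute_delay > 300;
```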