clickhouse-operator

CHK pods restart all at once when a modification to ClickHouseKeeperInstallation is made

Open Tchirana opened this issue 4 months ago • 12 comments

Operator version: 0.25.3.

I have this configured in the ClickHouseKeeperInstallation:

pdbManaged: "True"
pdbMaxUnavailable: 1
replicas: 3

The PDB for the CHK is created correctly (as is the one for the CHI), but when I modify something in the ClickHouseKeeperInstallation, such as the image during an upgrade, all the CHK pods are restarted at the same time instead of one by one, as happens for the CHI pods. I have read through the docs and cannot find anything related to this. Both CHK and CHI pods come from individual StatefulSets, as the operator creates them. Am I missing something?

Tchirana avatar Aug 19 '25 20:08 Tchirana

@Tchirana, the PDB is for Kubernetes, e.g. when Kubernetes decides to reschedule pods or something like that. The operator has different logic for upgrades. Could you explain in more detail what you were doing?

alex-zaitsev avatar Aug 26 '25 10:08 alex-zaitsev

Hello, just the normal stuff: replacing the image in the CHK installation manifest. What is even weirder is that we have several ClickHouse installations in different clusters; all of the CHI and CHK installations use exactly the same manifests and the same operator version with the same settings, but they behave differently in every cluster. On an image change for an upgrade, in some clusters the pods restart one at a time, but in others two at a time or even all of them at the same time. We use 3-pod CHK clusters, and each pod comes from its own sts (that is how the operator provisions them).

Tchirana avatar Aug 26 '25 17:08 Tchirana

@alex-zaitsev I renamed the issue, since you are correct about the PDB. That does take effect because the operator sets up 1 pod per sts.

Tchirana avatar Sep 01 '25 17:09 Tchirana

We are seeing a similar issue when updating PVCs and the node selector on the CHK

wilkermichael avatar Sep 09 '25 15:09 wilkermichael

I really think it is a race condition, because it makes no sense. On some clusters those CHK pods behaved; on others they restarted chaotically, all at once or 2 at a time. And all our clusters are exactly the same with regard to k8s settings, operator configs, CHK configs...

Tchirana avatar Sep 09 '25 19:09 Tchirana

Because of this, we usually stop everything that writes data to ClickHouse, stop the CHI pods, and only then do maintenance on the CHK. Otherwise there is a very big risk of databases getting into a read-only state, which is very annoying to fix.

Tchirana avatar Sep 09 '25 19:09 Tchirana

I have the same problem. Operator version 0.24.5.

When making changes to a ClickHouseKeeperInstallation with replicasCount: 3, e.g. updating the image version in podTemplates, the first replica (0-0) is always updated first; then the 2nd and 3rd replicas are terminated at the same time and Keeper loses quorum. In the console of the first replica, echo mntr | nc 127.0.0.1 2181 outputs "This instance is not currently serving requests", and there are many errors in the ClickHouse logs.

chk-click-migration-cluster-0-0-0      1/1     Running   0          2m27s
chk-click-migration-cluster-0-1-0      1/1     Running   0          96s
chk-click-migration-cluster-0-2-0      1/1     Running   0          97s
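
The quorum loss described above follows from majority arithmetic: Keeper is a Raft-based ensemble, so it can only serve requests while a strict majority of its members is up. A minimal sketch of that arithmetic (the function names are illustrative, not part of any Keeper or operator API):

```python
def quorum_size(replicas: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return replicas // 2 + 1


def has_quorum(replicas: int, available: int) -> bool:
    """True when enough members are up for a majority-based ensemble to serve requests."""
    return available >= quorum_size(replicas)


if __name__ == "__main__":
    replicas = 3
    for down in range(replicas + 1):
        up = replicas - down
        print(f"{down} down, {up} up -> quorum: {has_quorum(replicas, up)}")
```

With 3 replicas the majority is 2, so one pod may be down at any moment, but two pods terminating together leave a single member that cannot serve requests.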

But this behavior is inconsistent: sometimes the 2nd and 3rd replicas are updated with some interval between them, without losing quorum.

chk-click-migration-cluster-0-0-0      1/1     Running   0          1m34s
chk-click-migration-cluster-0-1-0      1/1     Running   0          23s
chk-click-migration-cluster-0-2-0      1/1     Running   0          38s
chk-click-migration-cluster-0-0-0      1/1     Running   0          4m4s
chk-click-migration-cluster-0-1-0      1/1     Running   0          3m19s
chk-click-migration-cluster-0-2-0      1/1     Running   0          2m45s
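
To compare listings like the ones above, the AGE column can be converted to seconds and the gaps between consecutive pod starts computed. This is only a rough sketch: it assumes the last column is a kubectl-style age such as 2m27s or 96s, and it treats pod age as a proxy for when the previous pod was terminated. The helper names are made up for illustration.

```python
import re

AGE_PART = re.compile(r"(\d+)([dhms])")
UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def age_to_seconds(age: str) -> int:
    """Convert an AGE value such as '2m27s' or '96s' to seconds."""
    parts = AGE_PART.findall(age)
    # Reject input that is not made up entirely of <number><unit> pairs.
    if not parts or "".join(n + u for n, u in parts) != age:
        raise ValueError(f"unrecognised age: {age!r}")
    return sum(int(n) * UNIT_SECONDS[u] for n, u in parts)


def restart_gaps(listing: str) -> list[tuple[str, str, int]]:
    """Return (earlier_pod, later_pod, gap_seconds), ordered by start time."""
    pods = []
    for line in listing.strip().splitlines():
        fields = line.split()
        pods.append((fields[0], age_to_seconds(fields[-1])))
    pods.sort(key=lambda p: p[1], reverse=True)  # oldest pod first
    return [(a[0], b[0], a[1] - b[1]) for a, b in zip(pods, pods[1:])]
```

For the first listing (ages 2m27s, 96s, 97s) this gives a 1-second gap between replicas 0-2 and 0-1, which matches the quorum loss; for the listing with ages 1m34s, 23s, 38s the gap between the same two replicas is 15 seconds.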

spirkaa avatar Oct 07 '25 13:10 spirkaa

@spirkaa yeah. We fck-ed an entire cluster because of this, and it was a nightmare to get the databases back from read-only. We were grateful we have a dev environment :)

Tchirana avatar Oct 07 '25 19:10 Tchirana

@alex-zaitsev how can we help tackle this?

deadlybore avatar Oct 15 '25 15:10 deadlybore

Ouch... also noticing this just now. Moved over from ZooKeeper as Keeper was recommended.

@alex-zaitsev any chance someone can look into this? It seems like a critical thing, taking down all Keeper pods. Below is my chop log from when I apply a CHK change until I see the pods terminate (all 3 at the same time):

I1209 15:35:15.277038       1 worker-reconciler-chk.go:49] worker-reconciler-chk.go:49:reconcileCR():start:unknown
I1209 15:35:15.277065       1 worker.go:382] createTemplatedCR():unknown:CR has an ancestor, use it as a base for reconcile. CR: tst01/odl
I1209 15:35:15.278576       1 statefulset-reconciler.go:103] unknown:Have StatefulSet available, try to perform label-based comparison for sts: tst01/chk-odl-cluster1-0-0
I1209 15:35:15.278595       1 object-status.go:54] GetObjectStatusFromMetas():unknown:cur and new objects ARE DIFFERENT based on object version label: Update of the object is required. Object: tst01/chk-odl-cluster1-0-0. Cur: 10b55d9005c501b3040a3efe37ba3010f787a9e1 New: 2f935a5ac30131f75b7120c9933399f3ce3e072b
I1209 15:35:15.279454       1 statefulset-reconciler.go:103] unknown:Have StatefulSet available, try to perform label-based comparison for sts: tst01/chk-odl-cluster1-0-1
I1209 15:35:15.279472       1 object-status.go:54] GetObjectStatusFromMetas():unknown:cur and new objects ARE DIFFERENT based on object version label: Update of the object is required. Object: tst01/chk-odl-cluster1-0-1. Cur: b88fb7b36e936ac0be53c137765adfc987c77d03 New: 7678c212efeff068ae5b5b656e75de40bd7b7119
I1209 15:35:15.280232       1 statefulset-reconciler.go:103] unknown:Have StatefulSet available, try to perform label-based comparison for sts: tst01/chk-odl-cluster1-0-2
I1209 15:35:15.280247       1 object-status.go:54] GetObjectStatusFromMetas():unknown:cur and new objects ARE DIFFERENT based on object version label: Update of the object is required. Object: tst01/chk-odl-cluster1-0-2. Cur: 667b58a6643ecece49f0be8367b3c2036d641684 New: 62b3203937325f3b6f1f28d0a4f989d21bd2deff
I1209 15:35:15.282561       1 worker-reconciler-chk.go:102] unknown:IPs of the CR tst01/odl: len: 3 [100.112.0.19 100.112.3.175 100.112.1.15]
I1209 15:35:15.282588       1 worker.go:382] createTemplatedCR():unknown:CR has an ancestor, use it as a base for reconcile. CR: tst01/odl
I1209 15:35:15.285900       1 statefulset-reconciler.go:103] unknown:Have StatefulSet available, try to perform label-based comparison for sts: tst01/chk-odl-cluster1-0-0
I1209 15:35:15.285924       1 object-status.go:54] GetObjectStatusFromMetas():unknown:cur and new objects ARE DIFFERENT based on object version label: Update of the object is required. Object: tst01/chk-odl-cluster1-0-0. Cur: 10b55d9005c501b3040a3efe37ba3010f787a9e1 New: 2f935a5ac30131f75b7120c9933399f3ce3e072b
I1209 15:35:15.286973       1 statefulset-reconciler.go:103] unknown:Have StatefulSet available, try to perform label-based comparison for sts: tst01/chk-odl-cluster1-0-1
I1209 15:35:15.286993       1 object-status.go:54] GetObjectStatusFromMetas():unknown:cur and new objects ARE DIFFERENT based on object version label: Update of the object is required. Object: tst01/chk-odl-cluster1-0-1. Cur: b88fb7b36e936ac0be53c137765adfc987c77d03 New: 7678c212efeff068ae5b5b656e75de40bd7b7119
I1209 15:35:15.287722       1 statefulset-reconciler.go:103] unknown:Have StatefulSet available, try to perform label-based comparison for sts: tst01/chk-odl-cluster1-0-2
I1209 15:35:15.287738       1 object-status.go:54] GetObjectStatusFromMetas():unknown:cur and new objects ARE DIFFERENT based on object version label: Update of the object is required. Object: tst01/chk-odl-cluster1-0-2. Cur: 667b58a6643ecece49f0be8367b3c2036d641684 New: 62b3203937325f3b6f1f28d0a4f989d21bd2deff
I1209 15:35:15.289097       1 worker-reconciler-chk.go:154] logSWVersion():Host:0-0[0/0]:tst01/odl:Host software version: 0-0 99.99.99[99.99.99/so far so]
I1209 15:35:15.289110       1 worker-reconciler-chk.go:154] logSWVersion():Host:0-1[0/1]:tst01/odl:Host software version: 0-1 99.99.99[99.99.99/so far so]
I1209 15:35:15.289115       1 worker-reconciler-chk.go:154] logSWVersion():Host:0-2[0/2]:tst01/odl:Host software version: 0-2 99.99.99[99.99.99/so far so]
I1209 15:35:15.289121       1 worker-reconciler-chk.go:157] logSWVersion():unknown:CR software versions [min, max]: 99.99.99[99.99.99/] 99.99.99[99.99.99/so far so]
I1209 15:35:15.289568       1 worker-reconciler-chk.go:55] unknown:ActionPlan start buildCR ---------------------------------------------:
Diff start -------------------------
modified spec items num: 1
diff item [0]:'.Templates.PodTemplates[0].Spec.Affinity.PodAntiAffinity.RequiredDuringSchedulingIgnoredDuringExecution[0].LabelSelector.MatchExpressions[0].Values[0]' = '"odlw"'
Diff end -------------------------

ActionPlan end buildCR ---------------------------------------------
I1209 15:35:15.289584       1 worker-reconciler-chk.go:59] reconcileCR():unknown:ActionPlan has actions - continue reconcile
I1209 15:35:15.302289       1 worker.go:190] markReconcileStart():unknown:reconcile started, task id: auto-68c5d60a-45e2-453b-98f6-ce653510538a
I1209 15:35:15.302681       1 worker.go:373] Host:0-0[0/0]:tst01/odl:Host status: modified. Host: ns:tst01|chi:odl|clu:cluster1|sha:0|rep:0|host:0-0
I1209 15:35:15.302698       1 worker.go:373] Host:0-1[0/1]:tst01/odl:Host status: modified. Host: ns:tst01|chi:odl|clu:cluster1|sha:0|rep:1|host:0-1
I1209 15:35:15.302706       1 worker.go:373] Host:0-2[0/2]:tst01/odl:Host status: modified. Host: ns:tst01|chi:odl|clu:cluster1|sha:0|rep:2|host:0-2
I1209 15:35:15.302727       1 worker-reconciler-chk.go:176] unknown:Unable to use full fan-out mode. Counters: modified: 3 . CR: tst01/odl
I1209 15:35:15.302755       1 worker.go:417] RaftOptions: exclude hosts: [], attributes: status: unknown, tags: 
I1209 15:35:15.302808       1 generator.go:94] Host:0-0[0/0]:tst01/odl:Add host to RAFT servers: 0-0
I1209 15:35:15.302826       1 generator.go:94] Host:0-1[0/1]:tst01/odl:Add host to RAFT servers: 0-1
I1209 15:35:15.302841       1 generator.go:94] Host:0-2[0/2]:tst01/odl:Add host to RAFT servers: 0-2
I1209 15:35:15.309888       1 worker-config-map.go:81] updateConfigMap():unknown:Update ConfigMap tst01/chk-odl-common-configd
I1209 15:35:15.316236       1 worker-config-map.go:81] updateConfigMap():unknown:Update ConfigMap tst01/chk-odl-common-usersd

This is where all pods terminate:

I1209 15:35:25.322654       1 worker-pdb.go:36] PDB updated: tst01/chk-odl-cluster1
I1209 15:35:25.322691       1 worker-reconciler-chk.go:464] worker-reconciler-chk.go:464:reconcileShardsAndHosts():start:reconcileShardsAndHosts start
I1209 15:35:25.322704       1 worker-reconciler-helper.go:43] not found ReconcileShardsAndHostsOptionsCtxKey, use empty opts
I1209 15:35:25.322709       1 worker-reconciler-chk.go:479] starting first shard separately
I1209 15:35:25.323588       1 statefulset-reconciler.go:103] unknown:Have StatefulSet available, try to perform label-based comparison for sts: tst01/chk-odl-cluster1-0-0
I1209 15:35:25.323613       1 object-status.go:54] GetObjectStatusFromMetas():unknown:cur and new objects ARE DIFFERENT based on object version label: Update of the object is required. Object: tst01/chk-odl-cluster1-0-0. Cur: 10b55d9005c501b3040a3efe37ba3010f787a9e1 New: 2f935a5ac30131f75b7120c9933399f3ce3e072b
I1209 15:35:25.323628       1 worker-reconciler-chk.go:618] reconcileHostPrepare():Host:0-0[0/0]:tst01/odl:Include host into cluster. Host/shard/cluster: 0/0/cluster1
I1209 15:35:25.323642       1 worker-exclude-include-wait.go:100] includeHostIntoRaftCluster():Host:0-0[0/0]:tst01/odl:going to include host. Host/shard/cluster: 0/0/cluster1
I1209 15:35:25.323652       1 worker.go:417] RaftOptions: exclude hosts: [], attributes: status: unknown, tags: 
I1209 15:35:25.323740       1 generator.go:94] Host:0-0[0/0]:tst01/odl:Add host to RAFT servers: 0-0
I1209 15:35:25.323766       1 generator.go:94] Host:0-1[0/1]:tst01/odl:Add host to RAFT servers: 0-1
I1209 15:35:25.323790       1 generator.go:94] Host:0-2[0/2]:tst01/odl:Add host to RAFT servers: 0-2
I1209 15:35:25.331103       1 worker-config-map.go:81] updateConfigMap():unknown:Update ConfigMap tst01/chk-odl-common-configd
I1209 15:35:25.338113       1 worker-config-map.go:81] updateConfigMap():unknown:Update ConfigMap tst01/chk-odl-deploy-confd-cluster1-0-0
I1209 15:35:25.338133       1 worker-reconciler-chk.go:646] reconcileHostMain():Host:0-0[0/0]:tst01/odl:Reconcile PVCs and data loss for host: 0-0
I1209 15:35:25.338219       1 storage-reconciler.go:283] storage-reconciler.go:283:reconcilePVC():start:Host:0-0[0/0]:tst01/odl:reconcile PVC (tst01/data-chk-odl-cluster1-0-0-0/0-0)
I1209 15:35:25.347316       1 storage-reconciler.go:303] storage-reconciler.go:284:reconcilePVC():end:Host:0-0[0/0]:tst01/odl:reconcile PVC (tst01/data-chk-odl-cluster1-0-0-0/0-0)
I1209 15:35:25.347964       1 worker-service.go:39] reconcileService():unknown:Service found: tst01/chk-odl-cluster1-0-0. Will try to update
I1209 15:35:25.352294       1 worker-service.go:173] updateService():unknown:Update Service success: tst01/chk-odl-cluster1-0-0
I1209 15:35:25.352312       1 worker-service.go:60] reconcileService():unknown:Service reconcile successful: tst01/chk-odl-cluster1-0-0
I1209 15:35:25.352325       1 worker-reconciler-chk.go:394] reconcileHostService():Host:0-0[0/0]:tst01/odl:DONE Reconcile service of the host: 0-0
I1209 15:35:25.352340       1 worker-reconciler-chk.go:341] worker-reconciler-chk.go:341:reconcileHostStatefulSet():start:Host:0-0[0/0]:tst01/odl:reconcile StatefulSet start
I1209 15:35:25.352428       1 worker-reconciler-chk.go:347] reconcileHostStatefulSet():Host:0-0[0/0]:tst01/odl:Reconcile host: 0-0. App version: 99.99.99
I1209 15:35:25.352497       1 worker.go:164] shouldForceRestartHost():Host:0-0[0/0]:tst01/odl:Host force restart is not required. Host: 0-0
I1209 15:35:25.353175       1 statefulset-reconciler.go:103] unknown:Have StatefulSet available, try to perform label-based comparison for sts: tst01/chk-odl-cluster1-0-0
I1209 15:35:25.353194       1 object-status.go:54] GetObjectStatusFromMetas():unknown:cur and new objects ARE DIFFERENT based on object version label: Update of the object is required. Object: tst01/chk-odl-cluster1-0-0. Cur: 10b55d9005c501b3040a3efe37ba3010f787a9e1 New: 2f935a5ac30131f75b7120c9933399f3ce3e072b
I1209 15:35:25.353209       1 worker-reconciler-chk.go:362] reconcileHostStatefulSet():Host:0-0[0/0]:tst01/odl:Reconcile host STS: 0-0. Reconcile StatefulSet
I1209 15:35:25.353257       1 statefulset-reconciler.go:163] ReconcileStatefulSet():Host:0-0[0/0]:tst01/odl:Need to reconcile MODIFIED StatefulSet: tst01/chk-odl-cluster1-0-0
I1209 15:35:25.353700       1 util.go:39] StatefulSet.Spec ARE DIFFERENT:
added:
none
modified:
Diff start -------------------------
modified .spec items num: 20
diff item [0]:'.Template.Spec.Containers[0].Ports[1].Protocol' = '""'
diff item [1]:'.Template.Spec.Containers[0].LivenessProbe.TimeoutSeconds' = '0'
diff item [2]:'.Template.Spec.Containers[0].TerminationMessagePath' = '""'
diff item [3]:'.Template.Spec.Containers[0].TerminationMessagePolicy' = '""'
diff item [4]:'.Template.Spec.RestartPolicy' = '""'
diff item [5]:'.Template.Spec.Containers[0].ReadinessProbe.ProbeHandler.HTTPGet.Scheme' = '""'
diff item [6]:'.Template.Spec.Containers[0].ReadinessProbe.TimeoutSeconds' = '0'
diff item [7]:'.Template.Spec.DNSPolicy' = '""'
diff item [8]:'.Template.Spec.SecurityContext' = 'nil'
diff item [9]:'.Template.Spec.Affinity.PodAntiAffinity.RequiredDuringSchedulingIgnoredDuringExecution[0].LabelSelector.MatchExpressions[0].Values[0]' = '"odlw"'
diff item [10]:'.VolumeClaimTemplates[0].TypeMeta.APIVersion' = '"v1"'
diff item [11]:'.VolumeClaimTemplates[0].ObjectMeta.Annotations' = 'map[string]string{
}'
diff item [12]:'.Template.Spec.Containers[0].LivenessProbe.SuccessThreshold' = '0'
diff item [13]:'.Template.Spec.Containers[0].ReadinessProbe.SuccessThreshold' = '0'
diff item [14]:'.Template.Spec.Containers[0].Ports[0].Protocol' = '""'
diff item [15]:'.Template.Spec.SchedulerName' = '""'
diff item [16]:'.VolumeClaimTemplates[0].TypeMeta.Kind' = '"PersistentVolumeClaim"'
diff item [17]:'.VolumeClaimTemplates[0].Status.Phase' = '""'
diff item [18]:'.PersistentVolumeClaimRetentionPolicy' = 'nil'
diff item [19]:'.Template.ObjectMeta.Annotations' = 'map[string]string{
}'
Diff end -------------------------

removed:
none
I1209 15:35:25.353728       1 util.go:50] StatefulSet.Labels ARE DIFFERENT:
added:
none
modified:
Diff start -------------------------
modified .labels items num: 1
diff item [0]:'["clickhouse-keeper.altinity.com/object-version"]' = '"2f935a5ac30131f75b7120c9933399f3ce3e072b"'
Diff end -------------------------

removed:
none
I1209 15:35:25.353779       1 statefulset-reconciler.go:246] updateStatefulSet():Host:0-0[0/0]:tst01/odl:Update StatefulSet(tst01/chk-odl-cluster1-0-0) - started
I1209 15:35:25.353796       1 task.go:83] WaitForConfigMapPropagation():Host:0-0[0/0]:tst01/odl:No need to wait for ConfigMap propagation - no changes in ConfigMap
I1209 15:35:25.360953       1 statefulset-reconciler.go:439] doUpdateStatefulSet():Host:0-0[0/0]:tst01/odl:generation change 6=>7
W1209 15:35:25.360980       1 statefulset-reconciler.go:406] waitHostStatefulSetToLaunch():Host:0-0[0/0]:tst01/odl:Host is not properly launched - no waiting sts at all. Host: 0-0
I1209 15:35:25.374303       1 statefulset-reconciler.go:274] updateStatefulSet():Host:0-0[0/0]:tst01/odl:Update StatefulSet(tst01/chk-odl-cluster1-0-0) - completed
I1209 15:35:25.374355       1 worker-reconciler-chk.go:381] worker-reconciler-chk.go:342:reconcileHostStatefulSet():end:Host:0-0[0/0]:tst01/odl:reconcile StatefulSet end
I1209 15:35:25.374444       1 storage-reconciler.go:283] storage-reconciler.go:283:reconcilePVC():start:Host:0-0[0/0]:tst01/odl:reconcile PVC (tst01/data-chk-odl-cluster1-0-0-0/0-0)
I1209 15:35:25.384591       1 storage-reconciler.go:303] storage-reconciler.go:284:reconcilePVC():end:Host:0-0[0/0]:tst01/odl:reconcile PVC (tst01/data-chk-odl-cluster1-0-0-0/0-0)
I1209 15:35:25.384676       1 worker-reconciler-chk.go:602] reconcileHost():Host:0-0[0/0]:tst01/odl:[now: 2025-12-09 15:35:25.384612845 +0000 UTC m=+659.349933809] ProgressHostsCompleted: 1 of 3
I1209 15:35:25.397169       1 worker-service.go:39] reconcileService():unknown:Service found: tst01/keeper-odl. Will try to update
I1209 15:35:25.401904       1 worker-service.go:173] updateService():unknown:Update Service success: tst01/keeper-odl
I1209 15:35:25.401923       1 worker-service.go:60] reconcileService():unknown:Service reconcile successful: tst01/keeper-odl
I1209 15:35:25.402593       1 statefulset-reconciler.go:103] unknown:Have StatefulSet available, try to perform label-based comparison for sts: tst01/chk-odl-cluster1-0-1
I1209 15:35:25.402620       1 object-status.go:54] GetObjectStatusFromMetas():unknown:cur and new objects ARE DIFFERENT based on object version label: Update of the object is required. Object: tst01/chk-odl-cluster1-0-1. Cur: b88fb7b36e936ac0be53c137765adfc987c77d03 New: 7678c212efeff068ae5b5b656e75de40bd7b7119
I1209 15:35:25.402632       1 worker-reconciler-chk.go:618] reconcileHostPrepare():Host:0-1[0/1]:tst01/odl:Include host into cluster. Host/shard/cluster: 1/0/cluster1
I1209 15:35:25.402641       1 worker-exclude-include-wait.go:100] includeHostIntoRaftCluster():Host:0-1[0/1]:tst01/odl:going to include host. Host/shard/cluster: 1/0/cluster1
I1209 15:35:25.402649       1 worker.go:417] RaftOptions: exclude hosts: [], attributes: status: unknown, tags: 
I1209 15:35:25.402699       1 generator.go:94] Host:0-0[0/0]:tst01/odl:Add host to RAFT servers: 0-0
I1209 15:35:25.402716       1 generator.go:94] Host:0-1[0/1]:tst01/odl:Add host to RAFT servers: 0-1
I1209 15:35:25.402730       1 generator.go:94] Host:0-2[0/2]:tst01/odl:Add host to RAFT servers: 0-2
I1209 15:35:25.409824       1 worker-config-map.go:81] updateConfigMap():unknown:Update ConfigMap tst01/chk-odl-common-configd
I1209 15:35:25.416704       1 worker-config-map.go:81] updateConfigMap():unknown:Update ConfigMap tst01/chk-odl-deploy-confd-cluster1-0-1
I1209 15:35:25.416724       1 worker-reconciler-chk.go:646] reconcileHostMain():Host:0-1[0/1]:tst01/odl:Reconcile PVCs and data loss for host: 0-1
I1209 15:35:25.416815       1 storage-reconciler.go:283] storage-reconciler.go:283:reconcilePVC():start:Host:0-1[0/1]:tst01/odl:reconcile PVC (tst01/data-chk-odl-cluster1-0-1-0/0-1)
I1209 15:35:25.423136       1 storage-reconciler.go:303] storage-reconciler.go:284:reconcilePVC():end:Host:0-1[0/1]:tst01/odl:reconcile PVC (tst01/data-chk-odl-cluster1-0-1-0/0-1)
I1209 15:35:25.423781       1 worker-service.go:39] reconcileService():unknown:Service found: tst01/chk-odl-cluster1-0-1. Will try to update
I1209 15:35:25.428583       1 worker-service.go:173] updateService():unknown:Update Service success: tst01/chk-odl-cluster1-0-1
I1209 15:35:25.428602       1 worker-service.go:60] reconcileService():unknown:Service reconcile successful: tst01/chk-odl-cluster1-0-1
I1209 15:35:25.428637       1 worker-reconciler-chk.go:394] reconcileHostService():Host:0-1[0/1]:tst01/odl:DONE Reconcile service of the host: 0-1
I1209 15:35:25.428657       1 worker-reconciler-chk.go:341] worker-reconciler-chk.go:341:reconcileHostStatefulSet():start:Host:0-1[0/1]:tst01/odl:reconcile StatefulSet start
I1209 15:35:25.428753       1 worker-reconciler-chk.go:347] reconcileHostStatefulSet():Host:0-1[0/1]:tst01/odl:Reconcile host: 0-1. App version: 99.99.99
I1209 15:35:25.428821       1 worker.go:164] shouldForceRestartHost():Host:0-1[0/1]:tst01/odl:Host force restart is not required. Host: 0-1
I1209 15:35:25.429953       1 statefulset-reconciler.go:103] unknown:Have StatefulSet available, try to perform label-based comparison for sts: tst01/chk-odl-cluster1-0-1
I1209 15:35:25.429998       1 object-status.go:54] GetObjectStatusFromMetas():unknown:cur and new objects ARE DIFFERENT based on object version label: Update of the object is required. Object: tst01/chk-odl-cluster1-0-1. Cur: b88fb7b36e936ac0be53c137765adfc987c77d03 New: 7678c212efeff068ae5b5b656e75de40bd7b7119
I1209 15:35:25.430027       1 worker-reconciler-chk.go:362] reconcileHostStatefulSet():Host:0-1[0/1]:tst01/odl:Reconcile host STS: 0-1. Reconcile StatefulSet
I1209 15:35:25.430089       1 statefulset-reconciler.go:163] ReconcileStatefulSet():Host:0-1[0/1]:tst01/odl:Need to reconcile MODIFIED StatefulSet: tst01/chk-odl-cluster1-0-1
I1209 15:35:25.430507       1 util.go:39] StatefulSet.Spec ARE DIFFERENT:
added:
none
modified:
Diff start -------------------------
modified .spec items num: 20
diff item [0]:'.Template.Spec.Containers[0].LivenessProbe.SuccessThreshold' = '0'
diff item [1]:'.Template.Spec.Containers[0].ReadinessProbe.ProbeHandler.HTTPGet.Scheme' = '""'
diff item [2]:'.Template.Spec.Containers[0].ReadinessProbe.TimeoutSeconds' = '0'
diff item [3]:'.Template.Spec.Containers[0].TerminationMessagePath' = '""'
diff item [4]:'.Template.Spec.RestartPolicy' = '""'
diff item [5]:'.VolumeClaimTemplates[0].Status.Phase' = '""'
diff item [6]:'.Template.ObjectMeta.Annotations' = 'map[string]string{
}'
diff item [7]:'.Template.Spec.Containers[0].Ports[0].Protocol' = '""'
diff item [8]:'.Template.Spec.Containers[0].TerminationMessagePolicy' = '""'
diff item [9]:'.Template.Spec.SecurityContext' = 'nil'
diff item [10]:'.VolumeClaimTemplates[0].TypeMeta.APIVersion' = '"v1"'
diff item [11]:'.Template.Spec.Containers[0].Ports[1].Protocol' = '""'
diff item [12]:'.Template.Spec.Containers[0].LivenessProbe.TimeoutSeconds' = '0'
diff item [13]:'.Template.Spec.Containers[0].ReadinessProbe.SuccessThreshold' = '0'
diff item [14]:'.Template.Spec.DNSPolicy' = '""'
diff item [15]:'.Template.Spec.Affinity.PodAntiAffinity.RequiredDuringSchedulingIgnoredDuringExecution[0].LabelSelector.MatchExpressions[0].Values[0]' = '"odlw"'
diff item [16]:'.VolumeClaimTemplates[0].TypeMeta.Kind' = '"PersistentVolumeClaim"'
diff item [17]:'.Template.Spec.SchedulerName' = '""'
diff item [18]:'.VolumeClaimTemplates[0].ObjectMeta.Annotations' = 'map[string]string{
}'
diff item [19]:'.PersistentVolumeClaimRetentionPolicy' = 'nil'
Diff end -------------------------

removed:
none
I1209 15:35:25.430532       1 util.go:50] StatefulSet.Labels ARE DIFFERENT:
added:
none
modified:
Diff start -------------------------
modified .labels items num: 1
diff item [0]:'["clickhouse-keeper.altinity.com/object-version"]' = '"7678c212efeff068ae5b5b656e75de40bd7b7119"'
Diff end -------------------------

removed:
none
I1209 15:35:25.430580       1 statefulset-reconciler.go:246] updateStatefulSet():Host:0-1[0/1]:tst01/odl:Update StatefulSet(tst01/chk-odl-cluster1-0-1) - started
I1209 15:35:25.430597       1 task.go:83] WaitForConfigMapPropagation():Host:0-1[0/1]:tst01/odl:No need to wait for ConfigMap propagation - no changes in ConfigMap
I1209 15:35:25.437264       1 statefulset-reconciler.go:439] doUpdateStatefulSet():Host:0-1[0/1]:tst01/odl:generation change 6=>7
W1209 15:35:25.437293       1 statefulset-reconciler.go:406] waitHostStatefulSetToLaunch():Host:0-1[0/1]:tst01/odl:Host is not properly launched - no waiting sts at all. Host: 0-1
I1209 15:35:25.449958       1 statefulset-reconciler.go:274] updateStatefulSet():Host:0-1[0/1]:tst01/odl:Update StatefulSet(tst01/chk-odl-cluster1-0-1) - completed
I1209 15:35:25.449999       1 worker-reconciler-chk.go:381] worker-reconciler-chk.go:342:reconcileHostStatefulSet():end:Host:0-1[0/1]:tst01/odl:reconcile StatefulSet end
I1209 15:35:25.450084       1 storage-reconciler.go:283] storage-reconciler.go:283:reconcilePVC():start:Host:0-1[0/1]:tst01/odl:reconcile PVC (tst01/data-chk-odl-cluster1-0-1-0/0-1)
I1209 15:35:25.456123       1 storage-reconciler.go:303] storage-reconciler.go:284:reconcilePVC():end:Host:0-1[0/1]:tst01/odl:reconcile PVC (tst01/data-chk-odl-cluster1-0-1-0/0-1)
I1209 15:35:25.456162       1 worker-reconciler-chk.go:602] reconcileHost():Host:0-1[0/1]:tst01/odl:[now: 2025-12-09 15:35:25.456145899 +0000 UTC m=+659.421466873] ProgressHostsCompleted: 2 of 3
I1209 15:35:25.467080       1 statefulset-reconciler.go:103] unknown:Have StatefulSet available, try to perform label-based comparison for sts: tst01/chk-odl-cluster1-0-2
I1209 15:35:25.467106       1 object-status.go:54] GetObjectStatusFromMetas():unknown:cur and new objects ARE DIFFERENT based on object version label: Update of the object is required. Object: tst01/chk-odl-cluster1-0-2. Cur: 667b58a6643ecece49f0be8367b3c2036d641684 New: 62b3203937325f3b6f1f28d0a4f989d21bd2deff
I1209 15:35:25.467119       1 worker-reconciler-chk.go:618] reconcileHostPrepare():Host:0-2[0/2]:tst01/odl:Include host into cluster. Host/shard/cluster: 2/0/cluster1
I1209 15:35:25.467132       1 worker-exclude-include-wait.go:100] includeHostIntoRaftCluster():Host:0-2[0/2]:tst01/odl:going to include host. Host/shard/cluster: 2/0/cluster1
I1209 15:35:25.467141       1 worker.go:417] RaftOptions: exclude hosts: [], attributes: status: unknown, tags: 
I1209 15:35:25.467203       1 generator.go:94] Host:0-0[0/0]:tst01/odl:Add host to RAFT servers: 0-0
I1209 15:35:25.467229       1 generator.go:94] Host:0-1[0/1]:tst01/odl:Add host to RAFT servers: 0-1
I1209 15:35:25.467250       1 generator.go:94] Host:0-2[0/2]:tst01/odl:Add host to RAFT servers: 0-2
I1209 15:35:25.473889       1 worker-config-map.go:81] updateConfigMap():unknown:Update ConfigMap tst01/chk-odl-common-configd
I1209 15:35:25.480428       1 worker-config-map.go:81] updateConfigMap():unknown:Update ConfigMap tst01/chk-odl-deploy-confd-cluster1-0-2
I1209 15:35:25.480450       1 worker-reconciler-chk.go:646] reconcileHostMain():Host:0-2[0/2]:tst01/odl:Reconcile PVCs and data loss for host: 0-2
I1209 15:35:25.480536       1 storage-reconciler.go:283] storage-reconciler.go:283:reconcilePVC():start:Host:0-2[0/2]:tst01/odl:reconcile PVC (tst01/data-chk-odl-cluster1-0-2-0/0-2)
I1209 15:35:25.486827       1 storage-reconciler.go:303] storage-reconciler.go:284:reconcilePVC():end:Host:0-2[0/2]:tst01/odl:reconcile PVC (tst01/data-chk-odl-cluster1-0-2-0/0-2)
I1209 15:35:25.487474       1 worker-service.go:39] reconcileService():unknown:Service found: tst01/chk-odl-cluster1-0-2. Will try to update
I1209 15:35:25.492355       1 worker-service.go:173] updateService():unknown:Update Service success: tst01/chk-odl-cluster1-0-2
I1209 15:35:25.492376       1 worker-service.go:60] reconcileService():unknown:Service reconcile successful: tst01/chk-odl-cluster1-0-2
I1209 15:35:25.492397       1 worker-reconciler-chk.go:394] reconcileHostService():Host:0-2[0/2]:tst01/odl:DONE Reconcile service of the host: 0-2
I1209 15:35:25.492417       1 worker-reconciler-chk.go:341] worker-reconciler-chk.go:341:reconcileHostStatefulSet():start:Host:0-2[0/2]:tst01/odl:reconcile StatefulSet start
I1209 15:35:25.492506       1 worker-reconciler-chk.go:347] reconcileHostStatefulSet():Host:0-2[0/2]:tst01/odl:Reconcile host: 0-2. App version: 99.99.99
I1209 15:35:25.492565       1 worker.go:164] shouldForceRestartHost():Host:0-2[0/2]:tst01/odl:Host force restart is not required. Host: 0-2
I1209 15:35:25.494637       1 statefulset-reconciler.go:103] unknown:Have StatefulSet available, try to perform label-based comparison for sts: tst01/chk-odl-cluster1-0-2
I1209 15:35:25.494722       1 object-status.go:54] GetObjectStatusFromMetas():unknown:cur and new objects ARE DIFFERENT based on object version label: Update of the object is required. Object: tst01/chk-odl-cluster1-0-2. Cur: 667b58a6643ecece49f0be8367b3c2036d641684 New: 62b3203937325f3b6f1f28d0a4f989d21bd2deff
I1209 15:35:25.494799       1 worker-reconciler-chk.go:362] reconcileHostStatefulSet():Host:0-2[0/2]:tst01/odl:Reconcile host STS: 0-2. Reconcile StatefulSet
I1209 15:35:25.494965       1 statefulset-reconciler.go:163] ReconcileStatefulSet():Host:0-2[0/2]:tst01/odl:Need to reconcile MODIFIED StatefulSet: tst01/chk-odl-cluster1-0-2
I1209 15:35:25.496100       1 util.go:39] StatefulSet.Spec ARE DIFFERENT:
added:
none
modified:
Diff start -------------------------
modified .spec items num: 20
diff item [0]:'.PersistentVolumeClaimRetentionPolicy' = 'nil'
diff item [1]:'.Template.Spec.Containers[0].Ports[1].Protocol' = '""'
diff item [2]:'.Template.Spec.Containers[0].ReadinessProbe.SuccessThreshold' = '0'
diff item [3]:'.Template.Spec.SecurityContext' = 'nil'
diff item [4]:'.Template.Spec.Affinity.PodAntiAffinity.RequiredDuringSchedulingIgnoredDuringExecution[0].LabelSelector.MatchExpressions[0].Values[0]' = '"odlw"'
diff item [5]:'.Template.Spec.SchedulerName' = '""'
diff item [6]:'.VolumeClaimTemplates[0].ObjectMeta.Annotations' = 'map[string]string{
}'
diff item [7]:'.Template.Spec.RestartPolicy' = '""'
diff item [8]:'.VolumeClaimTemplates[0].Status.Phase' = '""'
diff item [9]:'.Template.ObjectMeta.Annotations' = 'map[string]string{
}'
diff item [10]:'.Template.Spec.Containers[0].Ports[0].Protocol' = '""'
diff item [11]:'.Template.Spec.Containers[0].LivenessProbe.TimeoutSeconds' = '0'
diff item [12]:'.Template.Spec.Containers[0].LivenessProbe.SuccessThreshold' = '0'
diff item [13]:'.Template.Spec.Containers[0].ReadinessProbe.TimeoutSeconds' = '0'
diff item [14]:'.Template.Spec.Containers[0].TerminationMessagePath' = '""'
diff item [15]:'.VolumeClaimTemplates[0].TypeMeta.APIVersion' = '"v1"'
diff item [16]:'.Template.Spec.Containers[0].ReadinessProbe.ProbeHandler.HTTPGet.Scheme' = '""'
diff item [17]:'.Template.Spec.Containers[0].TerminationMessagePolicy' = '""'
diff item [18]:'.Template.Spec.DNSPolicy' = '""'
diff item [19]:'.VolumeClaimTemplates[0].TypeMeta.Kind' = '"PersistentVolumeClaim"'
Diff end -------------------------

removed:
none
I1209 15:35:25.496125       1 util.go:50] StatefulSet.Labels ARE DIFFERENT:
added:
none
modified:
Diff start -------------------------
modified .labels items num: 1
diff item [0]:'["clickhouse-keeper.altinity.com/object-version"]' = '"62b3203937325f3b6f1f28d0a4f989d21bd2deff"'
Diff end -------------------------

removed:
none
I1209 15:35:25.496180       1 statefulset-reconciler.go:246] updateStatefulSet():Host:0-2[0/2]:tst01/odl:Update StatefulSet(tst01/chk-odl-cluster1-0-2) - started
I1209 15:35:25.496193       1 task.go:83] WaitForConfigMapPropagation():Host:0-2[0/2]:tst01/odl:No need to wait for ConfigMap propagation - no changes in ConfigMap
I1209 15:35:25.503401       1 statefulset-reconciler.go:439] doUpdateStatefulSet():Host:0-2[0/2]:tst01/odl:generation change 6=>7
W1209 15:35:25.503425       1 statefulset-reconciler.go:406] waitHostStatefulSetToLaunch():Host:0-2[0/2]:tst01/odl:Host is not properly launched - no waiting sts at all. Host: 0-2
I1209 15:35:25.514617       1 statefulset-reconciler.go:274] updateStatefulSet():Host:0-2[0/2]:tst01/odl:Update StatefulSet(tst01/chk-odl-cluster1-0-2) - completed
I1209 15:35:25.514659       1 worker-reconciler-chk.go:381] worker-reconciler-chk.go:342:reconcileHostStatefulSet():end:Host:0-2[0/2]:tst01/odl:reconcile StatefulSet end
I1209 15:35:25.514738       1 storage-reconciler.go:283] storage-reconciler.go:283:reconcilePVC():start:Host:0-2[0/2]:tst01/odl:reconcile PVC (tst01/data-chk-odl-cluster1-0-2-0/0-2)
I1209 15:35:25.520271       1 storage-reconciler.go:303] storage-reconciler.go:284:reconcilePVC():end:Host:0-2[0/2]:tst01/odl:reconcile PVC (tst01/data-chk-odl-cluster1-0-2-0/0-2)
I1209 15:35:25.520307       1 worker-reconciler-chk.go:602] reconcileHost():Host:0-2[0/2]:tst01/odl:[now: 2025-12-09 15:35:25.520288635 +0000 UTC m=+659.485609599] ProgressHostsCompleted: 3 of 3
I1209 15:35:25.530544       1 worker-reconciler-chk.go:492] Starting rest of shards on workers: 1
I1209 15:35:25.530556       1 worker-reconciler-chk.go:525] Finished successfully rest of shards on workers: 1
I1209 15:35:25.530562       1 worker-reconciler-chk.go:526] worker-reconciler-chk.go:465:reconcileShardsAndHosts():end:reconcileShardsAndHosts end
I1209 15:35:25.530666       1 generator.go:94] Host:0-0[0/0]:tst01/odl:Add host to RAFT servers: 0-0
I1209 15:35:25.530691       1 generator.go:94] Host:0-1[0/1]:tst01/odl:Add host to RAFT servers: 0-1
I1209 15:35:25.530712       1 generator.go:94] Host:0-2[0/2]:tst01/odl:Add host to RAFT servers: 0-2
I1209 15:35:25.537203       1 worker-config-map.go:81] updateConfigMap():unknown:Update ConfigMap tst01/chk-odl-common-configd
I1209 15:35:25.537235       1 worker-deleter.go:41] clean():unknown:remove items scheduled for deletion
I1209 15:35:25.537249       1 worker-deleter.go:44] clean():unknown:List of objects which have failed to reconcile:
%!s(func() *model.Registry=0x18958c0)
I1209 15:35:25.537267       1 worker-deleter.go:45] clean():unknown:List of successfully reconciled objects:
%!s(func() *model.Registry=0x1895900)
I1209 15:35:25.537675       1 worker-deleter.go:48] clean():unknown:Existing objects:
Service: tst01/chk-odl-cluster1-0-1
Service: tst01/chk-odl-cluster1-0-2
Service: tst01/keeper-odl
Service: tst01/chk-odl-cluster1-0-0
PVC: tst01/data-chk-odl-cluster1-0-2-0
PVC: tst01/data-chk-odl-cluster1-0-0-0
PVC: tst01/data-chk-odl-cluster1-0-1-0
PDB: tst01/chk-odl-cluster1
StatefulSet: tst01/chk-odl-cluster1-0-0
StatefulSet: tst01/chk-odl-cluster1-0-1
StatefulSet: tst01/chk-odl-cluster1-0-2
ConfigMap: tst01/chk-odl-common-configd
ConfigMap: tst01/chk-odl-common-usersd
ConfigMap: tst01/chk-odl-deploy-confd-cluster1-0-1
ConfigMap: tst01/chk-odl-deploy-confd-cluster1-0-0
ConfigMap: tst01/chk-odl-deploy-confd-cluster1-0-2
I1209 15:35:25.537729       1 worker-deleter.go:50] clean():unknown:Non-reconciled objects:
I1209 15:35:25.537751       1 worker-exclude-include-wait.go:36] worker-exclude-include-wait.go:36:waitForIPAddresses():start:unknown:wait for IP addresses to be assigned to all pods
I1209 15:35:25.537948       1 worker-exclude-include-wait.go:49] unknown:all IP addresses are in place
I1209 15:35:25.537973       1 worker.go:200] worker.go:200:finalizeReconcileAndMarkCompleted():start:unknown:finalize reconcile
I1209 15:35:25.538570       1 worker.go:204] unknown:updating endpoints for CR-2 odl
I1209 15:35:25.538719       1 worker.go:206] unknown:IPs of the CR-2 finalize reconcile tst01/odl: len: 3 [100.112.0.19 100.112.3.175 100.112.1.15]
I1209 15:35:25.539158       1 worker.go:210] unknown:Update users IPS-2
I1209 15:35:25.548062       1 worker.go:236] finalizeReconcileAndMarkCompleted():unknown:reconcile completed successfully, task id: auto-68c5d60a-45e2-453b-98f6-ce653510538a
I1209 15:35:25.548087       1 worker-reconciler-chk.go:91] worker-reconciler-chk.go:50:reconcileCR():end:unknown
I1209 15:35:25.548878       1 worker-reconciler-chk.go:49] worker-reconciler-chk.go:49:reconcileCR():start:unknown
I1209 15:35:25.548904       1 worker.go:382] createTemplatedCR():unknown:CR has an ancestor, use it as a base for reconcile. CR: tst01/odl
I1209 15:35:25.550429       1 statefulset-reconciler.go:103] unknown:Have StatefulSet available, try to perform label-based comparison for sts: tst01/chk-odl-cluster1-0-0
I1209 15:35:25.550451       1 object-status.go:47] GetObjectStatusFromMetas():unknown:cur and new objects are equal based on object version label. Update of the object is not required. Object: tst01/chk-odl-cluster1-0-0
I1209 15:35:25.551114       1 statefulset-reconciler.go:103] unknown:Have StatefulSet available, try to perform label-based comparison for sts: tst01/chk-odl-cluster1-0-1
I1209 15:35:25.551130       1 object-status.go:47] GetObjectStatusFromMetas():unknown:cur and new objects are equal based on object version label. Update of the object is not required. Object: tst01/chk-odl-cluster1-0-1
I1209 15:35:25.551687       1 statefulset-reconciler.go:103] unknown:Have StatefulSet available, try to perform label-based comparison for sts: tst01/chk-odl-cluster1-0-2
I1209 15:35:25.551703       1 object-status.go:47] GetObjectStatusFromMetas():unknown:cur and new objects are equal based on object version label. Update of the object is not required. Object: tst01/chk-odl-cluster1-0-2
I1209 15:35:25.553071       1 worker-reconciler-chk.go:102] unknown:IPs of the CR tst01/odl: len: 3 [100.112.0.19 100.112.3.175 100.112.1.15]
I1209 15:35:25.553093       1 worker.go:382] createTemplatedCR():unknown:CR has an ancestor, use it as a base for reconcile. CR: tst01/odl
I1209 15:35:25.554356       1 statefulset-reconciler.go:103] unknown:Have StatefulSet available, try to perform label-based comparison for sts: tst01/chk-odl-cluster1-0-0
I1209 15:35:25.554374       1 object-status.go:47] GetObjectStatusFromMetas():unknown:cur and new objects are equal based on object version label. Update of the object is not required. Object: tst01/chk-odl-cluster1-0-0
I1209 15:35:25.554990       1 statefulset-reconciler.go:103] unknown:Have StatefulSet available, try to perform label-based comparison for sts: tst01/chk-odl-cluster1-0-1
I1209 15:35:25.555004       1 object-status.go:47] GetObjectStatusFromMetas():unknown:cur and new objects are equal based on object version label. Update of the object is not required. Object: tst01/chk-odl-cluster1-0-1
I1209 15:35:25.555604       1 statefulset-reconciler.go:103] unknown:Have StatefulSet available, try to perform label-based comparison for sts: tst01/chk-odl-cluster1-0-2
I1209 15:35:25.555617       1 object-status.go:47] GetObjectStatusFromMetas():unknown:cur and new objects are equal based on object version label. Update of the object is not required. Object: tst01/chk-odl-cluster1-0-2
I1209 15:35:25.556898       1 worker-reconciler-chk.go:154] logSWVersion():Host:0-0[0/0]:tst01/odl:Host software version: 0-0 99.99.99[99.99.99/so far so]
I1209 15:35:25.556910       1 worker-reconciler-chk.go:154] logSWVersion():Host:0-1[0/1]:tst01/odl:Host software version: 0-1 99.99.99[99.99.99/so far so]
I1209 15:35:25.556914       1 worker-reconciler-chk.go:154] logSWVersion():Host:0-2[0/2]:tst01/odl:Host software version: 0-2 99.99.99[99.99.99/so far so]
I1209 15:35:25.556921       1 worker-reconciler-chk.go:157] logSWVersion():unknown:CR software versions [min, max]: 99.99.99[99.99.99/] 99.99.99[99.99.99/so far so]
I1209 15:35:25.557401       1 worker-reconciler-chk.go:55] unknown:ActionPlan start buildCR ---------------------------------------------:
Diff start -------------------------
modified spec items num: 1
diff item [0]:'.Templates.PodTemplates[0].Spec.Affinity.PodAntiAffinity.RequiredDuringSchedulingIgnoredDuringExecution[0].LabelSelector.MatchExpressions[0].Values[0]' = '"odlw"'
Diff end -------------------------

ActionPlan end buildCR ---------------------------------------------
I1209 15:35:25.557417       1 worker-reconciler-chk.go:59] reconcileCR():unknown:ActionPlan has actions - continue reconcile
I1209 15:35:25.577985       1 worker.go:190] markReconcileStart():unknown:reconcile started, task id: auto-0f97e7cc-59e1-45a1-b282-109c7b12ac49
I1209 15:35:25.578326       1 worker.go:353] unknown:Add host as FOUND via host because host has an ancestor. Host: 0-0
I1209 15:35:25.578376       1 worker.go:353] unknown:Add host as FOUND via host because host has an ancestor. Host: 0-1
I1209 15:35:25.578410       1 worker.go:353] unknown:Add host as FOUND via host because host has an ancestor. Host: 0-2
I1209 15:35:25.578420       1 worker.go:373] Host:0-0[0/0]:tst01/odl:Host status: found. Host: ns:tst01|chi:odl|clu:cluster1|sha:0|rep:0|host:0-0
I1209 15:35:25.578428       1 worker.go:373] Host:0-1[0/1]:tst01/odl:Host status: found. Host: ns:tst01|chi:odl|clu:cluster1|sha:0|rep:1|host:0-1
I1209 15:35:25.578436       1 worker.go:373] Host:0-2[0/2]:tst01/odl:Host status: found. Host: ns:tst01|chi:odl|clu:cluster1|sha:0|rep:2|host:0-2
I1209 15:35:25.578463       1 worker-reconciler-chk.go:176] unknown:Unable to use full fan-out mode. Counters: found: 3 . CR: tst01/odl
I1209 15:35:25.578492       1 worker.go:417] RaftOptions: exclude hosts: [], attributes: status: unknown, tags: 
I1209 15:35:25.578546       1 generator.go:94] Host:0-0[0/0]:tst01/odl:Add host to RAFT servers: 0-0
I1209 15:35:25.578573       1 generator.go:94] Host:0-1[0/1]:tst01/odl:Add host to RAFT servers: 0-1
I1209 15:35:25.578590       1 generator.go:94] Host:0-2[0/2]:tst01/odl:Add host to RAFT servers: 0-2
I1209 15:35:25.585347       1 worker-config-map.go:81] updateConfigMap():unknown:Update ConfigMap tst01/chk-odl-common-configd
I1209 15:35:25.591929       1 worker-config-map.go:81] updateConfigMap():unknown:Update ConfigMap tst01/chk-odl-common-usersd

perosb avatar Dec 09 '25 15:12 perosb

CHI updateStatefulSet seems to wait for the ConfigMap, and then it finds an updated pod:

I1209 16:09:35.530512       1 statefulset-reconciler.go:246] updateStatefulSet():Host:0-0[0/0]:tst01/odl:Update StatefulSet(tst01/chi-odl-shard1-repl2-0-0) - started
I1209 16:09:35.719159       1 task.go:106] WaitForConfigMapPropagation():Host:0-0[0/0]:tst01/odl:Going to wait for ConfigMap propagation for: 8.213296849s [elapsed/timeout: 1.786703151s/10s]
I1209 16:09:43.933242       1 task.go:112] WaitForConfigMapPropagation():Host:0-0[0/0]:tst01/odl:Wait completed for: 8.213296849s  of timeout: 10s]
I1209 16:09:43.945281       1 statefulset-reconciler.go:439] doUpdateStatefulSet():Host:0-0[0/0]:tst01/odl:generation change 4=>5
I1209 16:09:43.945315       1 statefulset-reconciler.go:394] waitHostStatefulSetToLaunch():Host:0-0[0/0]:tst01/odl:Wait host sts ready. Host: 0-0

CHK doesn't wait for the ConfigMap (probably no need), but it immediately warns that the host is not properly launched ("no waiting sts at all").

Apart from that warning, nothing is logged from waitHostStatefulSetToLaunch or WaitUntilReady, even though a readinessProbe is configured.

I1209 15:26:23.622488       1 statefulset-reconciler.go:246] updateStatefulSet():Host:0-2[0/2]:tst01/odl:Update StatefulSet(tst01/chk-odl-cluster1-0-2) - started
I1209 15:26:23.622506       1 task.go:83] WaitForConfigMapPropagation():Host:0-2[0/2]:tst01/odl:No need to wait for ConfigMap propagation - no changes in ConfigMap
I1209 15:26:23.629551       1 statefulset-reconciler.go:439] doUpdateStatefulSet():Host:0-2[0/2]:tst01/odl:generation change 5=>6
W1209 15:26:23.629623       1 statefulset-reconciler.go:406] waitHostStatefulSetToLaunch():Host:0-2[0/2]:tst01/odl:Host is not properly launched - no waiting sts at all. Host: 0-2
I1209 15:26:23.640496       1 statefulset-reconciler.go:274] updateStatefulSet():Host:0-2[0/2]:tst01/odl:Update StatefulSet(tst01/chk-odl-cluster1-0-2) - completed
I1209 15:26:23.640528       1 worker-reconciler-chk.go:381] worker-reconciler-chk.go:342:reconcileHostStatefulSet():end:Host:0-2[0/2]:tst01/odl:reconcile StatefulSet end

perosb avatar Dec 09 '25 16:12 perosb

Possibly this code is missing from the reconcile options for CHK, so it never waits for probes?

and the call to it from reconcileHostStatefulSet

func (w *worker) prepareStsReconcileOptsWaitSection(host *api.Host, opts *statefulset.ReconcileOptions) *statefulset.ReconcileOptions {
	if host.GetCluster().GetReconcile().Host.Wait.Probes.GetStartup().IsTrue() {
		opts = opts.SetWaitUntilStarted()
		w.a.V(1).
			M(host).F().
			Warning("Setting option SetWaitUntilStarted ")
	}
	if host.GetCluster().GetReconcile().Host.Wait.Probes.GetReadiness().IsTrue() {
		opts = opts.SetWaitUntilReady()
		w.a.V(1).
			M(host).F().
			Warning("Setting option SetWaitUntilReady")
	}
	return opts
}
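
To illustrate the suspicion, here is a minimal standalone sketch of how such wait flags could gate the per-host rollout. It is not the operator's code: `ReconcileOptions`, `ShouldWait` and `describeUpdate` below are hypothetical, simplified stand-ins, and only the two setter names mirror the snippet above. If neither flag is ever set for CHK, the reconciler would have no reason to block after updating a StatefulSet, which would match the "no waiting sts at all" warning followed immediately by "completed".

```go
package main

import "fmt"

// ReconcileOptions is a hypothetical, simplified stand-in for the operator's
// statefulset.ReconcileOptions. Only the two wait flags are modelled here.
type ReconcileOptions struct {
	waitUntilStarted bool
	waitUntilReady   bool
}

// SetWaitUntilStarted returns a copy with the startup-probe wait enabled.
func (o ReconcileOptions) SetWaitUntilStarted() ReconcileOptions {
	o.waitUntilStarted = true
	return o
}

// SetWaitUntilReady returns a copy with the readiness-probe wait enabled.
func (o ReconcileOptions) SetWaitUntilReady() ReconcileOptions {
	o.waitUntilReady = true
	return o
}

// ShouldWait reports whether any wait was requested (assumed helper).
func (o ReconcileOptions) ShouldWait() bool {
	return o.waitUntilStarted || o.waitUntilReady
}

// describeUpdate mimics the decision taken right after a StatefulSet update.
func describeUpdate(host string, o ReconcileOptions) string {
	if !o.ShouldWait() {
		return host + ": no wait requested, moving on to the next host"
	}
	return host + ": waiting for probes before the next host"
}

func main() {
	// Options with no wait flags set: the suspected CHK behaviour.
	fmt.Println(describeUpdate("0-2", ReconcileOptions{}))
	// Options after a call like prepareStsReconcileOptsWaitSection set the readiness wait.
	fmt.Println(describeUpdate("0-2", ReconcileOptions{}.SetWaitUntilReady()))
}
```

In this model the only difference between a one-by-one rollout and an all-at-once rollout is whether something sets one of the flags before the StatefulSet is updated.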

perosb avatar Dec 09 '25 16:12 perosb