postgres-operator
The operator rolling-updates StatefulSets unnecessarily when the Kube API is down.
- Which image of the operator are you using? registry.opensource.zalan.do/acid/postgres-operator:v1.8.2
- Where do you run it - cloud or metal? Kubernetes or OpenShift? Kubernetes
- Are you running Postgres Operator in production? yes
- Type of issue? Bug report
Hello,
The operator may roll out StatefulSets unnecessarily if the Kube API is down.
In
https://github.com/zalando/postgres-operator/blob/6d0117b662bc2fd7880352b58258792ff2941d0a/pkg/cluster/k8sres.go#L992-L1023
if the operator cannot fetch the secret (for example because of `etcdserver: request timed out`), it returns an empty slice. The operator then builds the desired StatefulSet without the secret's env variables and updates the StatefulSet:
```
level=warning msg="could not read Secret PodEnvironmentSecretName: etcdserver: request timed out" cluster-name=namespace/pg-cluster pkg=cluster
...
level=debug msg="mark rolling update annotation for pg-cluster: reason pod changes" cluster-name=namespace/pg-cluster pkg=cluster
level=info msg="statefulset namespace/pg-cluster is not in the desired state and needs to be updated" cluster-name=namespace/pg-cluster pkg=cluster
... SHOWING the diff ...
...
level=info msg="reason: new statefulset containers's postgres (index 0) environment does not match the current one" cluster-name=namespace/pg-cluster pkg=cluster
level=debug msg="updating statefulset" cluster-name=namespace/pg-cluster pkg=cluster
...
level=debug msg="performing rolling update" cluster-name=namespace/pg-cluster pkg=cluster
```
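The problematic pattern can be sketched as follows. This is a minimal, self-contained illustration, not the operator's actual code: `fetchSecret`, `envFromSecretCurrent`, and the `secret` type are hypothetical stand-ins for the real client-go calls and `k8sres.go` logic.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-in for a Kubernetes Secret.
type secret struct{ Data map[string][]byte }

var errTimeout = errors.New("etcdserver: request timed out")

// fetchSecret simulates a Kube API call that fails while the API is down.
func fetchSecret(name string, apiDown bool) (*secret, error) {
	if apiDown {
		return nil, errTimeout
	}
	return &secret{Data: map[string][]byte{"PGPASSWORD": []byte("s3cr3t")}}, nil
}

// envFromSecretCurrent mirrors the problematic pattern: on ANY error it
// only logs a warning and returns an empty slice, so the desired
// StatefulSet is built without the secret's env variables, producing a
// spurious diff and a rolling update.
func envFromSecretCurrent(name string, apiDown bool) []string {
	sec, err := fetchSecret(name, apiDown)
	if err != nil {
		fmt.Printf("warning: could not read Secret %s: %v\n", name, err)
		return nil // empty env -> spurious diff -> rolling update
	}
	env := make([]string, 0, len(sec.Data))
	for k := range sec.Data {
		env = append(env, k)
	}
	return env
}

func main() {
	// While the API is down the env list is empty; once it is back the
	// secret's variables reappear, which triggers a second rolling update.
	fmt.Println(len(envFromSecretCurrent("pod-env-secret", true)))  // 0
	fmt.Println(len(envFromSecretCurrent("pod-env-secret", false))) // 1
}
```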
Of course, when the Kube API is up again, the operator will perform another rolling update to re-inject the secret's content.
Is this acceptable behaviour? Not for me. I think the operator should drop the env variables only when the secret lookup returns `apierrors.IsNotFound` (i.e. the secret was deleted because it is no longer needed, and even in that case I'm not sure). For any other error, the operator should stop syncing the cluster and wait until it has a clear view of the real current state.
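The suggested behaviour could be sketched like this. Again this is a hypothetical illustration, not a patch: `isNotFound` stands in for `apierrors.IsNotFound`, and `envFromSecretFixed` takes an abstract fetch function instead of the real client-go secret getter.

```go
package main

import (
	"errors"
	"fmt"
)

// Stand-ins: errNotFound plays the role of a Kubernetes NotFound error,
// errTimeout a transient API failure such as "etcdserver: request timed out".
var (
	errNotFound = errors.New("secrets \"pod-env-secret\" not found")
	errTimeout  = errors.New("etcdserver: request timed out")
)

// isNotFound stands in for apierrors.IsNotFound.
func isNotFound(err error) bool { return errors.Is(err, errNotFound) }

// envFromSecretFixed only treats a deleted secret as "no env variables".
// Any other error is propagated so the caller can abort the sync instead
// of diffing against an accidentally empty environment.
func envFromSecretFixed(fetch func() (map[string][]byte, error)) ([]string, error) {
	data, err := fetch()
	switch {
	case isNotFound(err):
		return nil, nil // secret intentionally gone: empty env is correct
	case err != nil:
		return nil, fmt.Errorf("could not read secret, skipping sync: %w", err)
	}
	env := make([]string, 0, len(data))
	for k := range data {
		env = append(env, k)
	}
	return env, nil
}

func main() {
	// Transient failure: the sync stops rather than rolling the StatefulSet.
	if _, err := envFromSecretFixed(func() (map[string][]byte, error) { return nil, errTimeout }); err != nil {
		fmt.Println("abort sync:", err)
	}
	// Secret genuinely deleted: proceed with an empty environment.
	env, err := envFromSecretFixed(func() (map[string][]byte, error) { return nil, errNotFound })
	fmt.Println(len(env), err)
}
```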
We use single-node PG clusters with standby clusters, as we don't want to enable failover for now. But I think even with multi-node PG clusters this would still perform two rolling updates for nothing.