mysql-operator
Operator Pods don't get ready after replication
Current Setup
We are running an operator deployment defined by this template in namespace mysql-operator:
helm template presslabs presslabs/mysql-operator -n mysql-operator \
--version 0.4.0 \
--include-crds \
--set antiAffinity=hard \
--set orchestrator.persistence.storageClass=local-path \
> cluster01/mysql-operator/mysql-operator.yaml
This works without problems. Now we want to scale the deployment up by using the following template:
helm template presslabs presslabs/mysql-operator -n mysql-operator \
--version 0.4.0 \
--include-crds \
--set antiAffinity=hard \
--set orchestrator.persistence.storageClass=local-path \
--set orchestrator.topologyPassword=<REDACTED> \
--set replicas=3 \
> cluster01/mysql-operator/mysql-operator.yaml
Problem
After applying the new template, the operator is scaled to three replicas as expected, but the new pods never become ready:
$ kubectl get pods
NAME                         READY   STATUS    RESTARTS   AGE
presslabs-mysql-operator-0   2/2     Running   0          7d
presslabs-mysql-operator-1   1/2     Running   0          21m
presslabs-mysql-operator-2   1/2     Running   0          21m
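To pick out only the pods that are not fully ready from a listing like the one above, the READY column can be filtered. This is a minimal sketch, assuming the default `kubectl get pods` column layout; the helper name `not_ready_pods` is hypothetical and not part of kubectl or the operator.

```shell
# Hypothetical helper: print the names of pods whose READY column
# (ready/total containers) is not full. Reads the default
# `kubectl get pods` table from stdin and skips the header row.
not_ready_pods() {
  awk 'NR > 1 { split($2, r, "/"); if (r[1] != r[2]) print $1 }'
}

# Usage against the cluster (namespace taken from the setup above):
# kubectl get pods -n mysql-operator | not_ready_pods
```

Each pod it prints can then be inspected per container with kubectl logs and the -c flag.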
Looking at the logs, I found that the problem is in the orchestrator container.
The output of kubectl logs presslabs-mysql-operator-1 -c orchestrator is attached:
log.txt
Is it possible to report this to the orchestrator project in order to get their help?
@SF2311 does this still happen with 0.5.0?
Is there any documentation I can refer to regarding the upgrade process from version 0.4.0 to 0.5.0?
https://github.com/bitpoke/mysql-operator/blob/master/docs/operator-upgrades.md
Hi, you can refer to the v0.3.x upgrade.
I reproduced the issue. Is your Kubernetes version 1.19?
link: #744
Yes, we are running Kubernetes v1.19. After upgrading the operator to v0.5.1, the problem persists.
My solution was to actively delete all the MySQL operator pods after the upgrade.
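For reference, that workaround could be scripted roughly as follows. This is only a sketch: the pod names and the mysql-operator namespace are taken from the original report, the function name `restart_operator_pods` is hypothetical, and it assumes the pods are recreated by their controller once deleted.

```shell
# Sketch: delete each operator pod by name so that its controller
# recreates it. Pod name prefix and namespace come from the report
# above; the replica count is a parameter (default 3).
restart_operator_pods() {
  replicas="${1:-3}"
  i=0
  while [ "$i" -lt "$replicas" ]; do
    kubectl delete pod "presslabs-mysql-operator-$i" -n mysql-operator
    i=$((i + 1))
  done
}

# Usage: restart_operator_pods 3
```

Afterwards, kubectl get pods -n mysql-operator shows whether the recreated pods reach 2/2.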
Did this solve the problem in the long term? For the first four days after the upgrade the operator worked fine, but then replication failed spontaneously. So I'm not convinced that deleting the pods will fix this for me in the long term.