endless "failed to update status"
Report
the ps-controller logs variations of this A LOT -- it seems like it isn't re-fetching the latest version of the object before updating its status.
More about the problem
2025-09-26T00:43:05.954Z ERROR failed to update status {"controller": "ps-controller", "controllerGroup": "ps.percona.com", "controllerKind": "PerconaServerMySQL", "PerconaServerMySQL": {"name":"ntpdb","namespace":"ntpdb"}, "namespace": "ntpdb", "name": "ntpdb", "reconcileID": "7db885ce-83b1-407d-b939-19162eece731", "error": "Operation cannot be fulfilled on perconaservermysqls.ps.percona.com \"ntpdb\": the object has been modified; please apply your changes to the latest version and try again"}
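That error is the API server's optimistic-concurrency check: the controller tried to write status from a stale resourceVersion. A minimal sketch of the usual controller-runtime remedy, assuming a client.Client and the CR type from the operator's API package (the function name, the mutate callback, and the import path are illustrative, not the operator's actual code):

```go
package sketch

import (
	"context"

	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/util/retry"
	"sigs.k8s.io/controller-runtime/pkg/client"

	// Assumed import path for the CR type; check the operator's go.mod.
	psv1 "github.com/percona/percona-server-mysql-operator/api/v1alpha1"
)

// writeStatus re-reads the latest object on every attempt, applies the status
// mutation to that fresh copy, and retries when the API server reports a
// resourceVersion conflict, instead of failing the reconcile outright.
func writeStatus(ctx context.Context, c client.Client, nn types.NamespacedName, mutate func(*psv1.PerconaServerMySQL)) error {
	return retry.RetryOnConflict(retry.DefaultRetry, func() error {
		cr := &psv1.PerconaServerMySQL{}
		if err := c.Get(ctx, nn, cr); err != nil {
			return err
		}
		mutate(cr) // set conditions, replica counts, etc. on the fresh object
		return c.Status().Update(ctx, cr)
	})
}
```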
Steps to reproduce
- run a cluster?
Versions
- Kubernetes - v1.32.6
- Operator - v0.12.0
- Database - v8.0.43 (?)
Anything else?
No response
Here's the operator log. I had reset a cluster entirely because group replication had failed, and it looks like the operator never got to update the status with the cluster being initialized.
This is the status from the CR -- last transition time here was ~9 hours ago (the cluster is ~17 hours old, so not sure what happened 9 hours ago).
Status:
  Conditions:
    Last Transition Time:  2025-09-25T15:58:03Z
    Message:
    Reason:                Initializing
    Status:                False
    Type:                  Initializing
    Last Transition Time:  2025-09-25T15:58:03Z
    Message:
    Reason:                Ready
    Status:                True
    Type:                  Ready
    Last Transition Time:  2025-09-25T16:08:18Z
    Message:               replication: reconcile group replication: reconcile bootstrap status: wait for cached cr to updated with condition: context deadline exceeded
    Reason:                ErrorReconcile
    Status:                True
    Type:                  Error
  Haproxy:
    Ready:  2
    Size:   2
    State:  ready
  Host:     ntpdb-haproxy.ntpdb
  Mysql:
    Ready:  3
    Size:   3
    State:  ready
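For what it's worth, the ErrorReconcile message above ("wait for cached cr to updated with condition: context deadline exceeded") describes the informer cache lagging behind the API server: the operator writes a condition, then polls its cached client until that condition becomes visible, and gives up at the deadline. A hedged sketch of what such a wait typically looks like, assuming Status.Conditions is a []metav1.Condition slice (waitForCondition, the timings, and the import path are illustrative):

```go
package sketch

import (
	"context"
	"time"

	"k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/apimachinery/pkg/util/wait"
	"sigs.k8s.io/controller-runtime/pkg/client"

	// Assumed import path for the CR type; check the operator's go.mod.
	psv1 "github.com/percona/percona-server-mysql-operator/api/v1alpha1"
)

// waitForCondition polls the (possibly stale) cached client until the given
// condition type shows up; if the cache never catches up before the timeout,
// the caller sees "context deadline exceeded".
func waitForCondition(ctx context.Context, c client.Client, nn types.NamespacedName, condType string) error {
	return wait.PollUntilContextTimeout(ctx, time.Second, 30*time.Second, true,
		func(ctx context.Context) (bool, error) {
			cr := &psv1.PerconaServerMySQL{}
			if err := c.Get(ctx, nn, cr); err != nil {
				return false, err
			}
			return meta.FindStatusCondition(cr.Status.Conditions, condType) != nil, nil
		})
}
```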
Another quirk -- I don't see the mysql_innodb_cluster_xxxx users anywhere in the operator code, but I get a lot of entries like this in the logs:
2025-09-26T01:20:01.900862Z 36 [Note] [MY-010926] [Server] Access denied for user 'mysql_innodb_cluster_42132052'@'10.42.4.8' (using password: YES)
Again, this is a pretty "plain" cluster that was set up fresh with the 0.12.0 operator yesterday.
CR attached here:
@abh Thank you for the testing. We will check it today.
> Here's the operator log. I had reset a cluster entirely because group replication had failed, and it looks like the operator never got to update the status with the cluster being initialized.
How did you do it? How did you reset it? I need the STR (steps to reproduce).
I deleted the CRs, the PVCs, and the STS, shut down the operator, then built it all back and restored each database I needed from a mysqldump.
> 2025-09-26T01:20:01.900862Z 36 [Note] [MY-010926] [Server] Access denied for user 'mysql_innodb_cluster_42132052'@'10.42.4.8' (using password: YES)
Do you have user creation in your dump file? Please use these two options when you create a dump:
mysqldump --skip-add-drop-user --skip-user-creation -u root -p your_database > new_dump.sql
or '--skip-system-users'.
@abh I am fairly sure that your status error will go away in v1.0.0 thanks to this commit: https://github.com/percona/percona-server-mysql-operator/commit/16ee670039835c54e12eedff81189ba373f7e3ef
@abh I’m happy to inform you that the MySQL Kubernetes Operator v1.0.0 was released today. This is the GA release, and you can start using it for production workloads. I also want to say a big thank you for using our operator and helping us improve it by providing feedback.