percona-server-mysql-operator endless "failed to update status"

Report

The ps-controller logs variations of "failed to update status" A LOT -- it seems like it isn't reloading the current status before updating it.
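For readers unfamiliar with why a stale read can make a status update fail: Kubernetes rejects a write that is based on an older resourceVersion of an object. The sketch below simulates that behaviour with a toy in-memory store and shows the usual remedy, which is to re-read the object before every retry. It assumes the failures are update conflicts (the attached log is not reproduced in this thread), and `FakeAPIServer`, `Conflict`, and `update_with_retry` are hypothetical names for illustration, not operator or client-library code.

```python
from dataclasses import dataclass, field


class Conflict(Exception):
    """Raised when an update is based on a stale resource version."""


@dataclass
class FakeAPIServer:
    """Toy stand-in for the API server: one object plus its version."""
    status: dict = field(default_factory=dict)
    resource_version: int = 1

    def get(self):
        # Return a copy so callers cannot mutate the stored object directly.
        return dict(self.status), self.resource_version

    def update_status(self, new_status, based_on_version):
        # Optimistic concurrency: reject writes built on an old version.
        if based_on_version != self.resource_version:
            raise Conflict("object has been modified")
        self.status = dict(new_status)
        self.resource_version += 1


def update_with_retry(server, mutate, attempts=5):
    """Re-read the object before every attempt, then apply `mutate`."""
    for _ in range(attempts):
        status, version = server.get()  # always start from fresh state
        mutate(status)
        try:
            server.update_status(status, version)
            return True
        except Conflict:
            continue  # another writer got there first; read again and retry
    return False


server = FakeAPIServer(status={"state": "initializing"})

# A reconciler reads the object, then another writer updates it first.
stale_status, stale_version = server.get()
server.update_status({"state": "ready"}, stale_version)

# Writing with the old version is rejected.
try:
    server.update_status({"state": "error"}, stale_version)
    stale_write_rejected = False
except Conflict:
    stale_write_rejected = True

# Re-reading before the write succeeds and keeps the other writer's change.
retried_ok = update_with_retry(server, lambda s: s.update(mysql_ready=3))
```

In a real controller the same idea is normally expressed with the client library's retry-on-conflict helper rather than a hand-written loop.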

Steps to reproduce

run a cluster?

Versions

Kubernetes - v1.32.6
Operator - v0.12.0
Database - v8.0.43 (?)

Anything else?

No response

Sep 26 '25 01:09 abh

Here's the operator log. I had reset a cluster entirely because group replication had failed, and it looks like the operator never got to update the status with the cluster being initialized.

percona-operator-log.txt

This is the status from the CR -- the last transition time here was ~9 hours ago (the cluster is ~17 hours old, so I'm not sure what happened 9 hours ago).

Status:
  Conditions:
    Last Transition Time:  2025-09-25T15:58:03Z
    Message:
    Reason:                Initializing
    Status:                False
    Type:                  Initializing
    Last Transition Time:  2025-09-25T15:58:03Z
    Message:
    Reason:                Ready
    Status:                True
    Type:                  Ready
    Last Transition Time:  2025-09-25T16:08:18Z
    Message:               replication: reconcile group replication: reconcile bootstrap status: wait for cached cr to updated with condition: context deadline exceeded
    Reason:                ErrorReconcile
    Status:                True
    Type:                  Error
  Haproxy:
    Ready:  2
    Size:   2
    State:  ready
  Host:     ntpdb-haproxy.ntpdb
  Mysql:
    Ready:  3
    Size:   3
    State:  ready

Sep 26 '25 01:09 abh

Another quirk -- I don't see the mysql_innodb_cluster_xxxx users anywhere in the operator code, but I get a lot of entries like this in the logs:

2025-09-26T01:20:01.900862Z 36 [Note] [MY-010926] [Server] Access denied for user 'mysql_innodb_cluster_42132052'@'10.42.4.8' (using password: YES)

Again, this is a pretty "plain" cluster that was set up fresh with the 0.12.0 operator yesterday.
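To get a feel for how many of these denials there are and which accounts and client addresses they come from, the log can be tallied with a short script. This is a minimal sketch that assumes every entry has the same shape as the line quoted above; `count_denied` is a hypothetical helper name, and the regular expression is written against that one sample only.

```python
import re
from collections import Counter

# Matches the "Access denied" note format shown in the log line above.
DENIED = re.compile(
    r"^(?P<ts>\S+) \d+ \[Note\] \[MY-010926\] \[Server\] "
    r"Access denied for user '(?P<user>[^']*)'@'(?P<host>[^']*)'"
)


def count_denied(lines):
    """Count access-denied entries per (user, host) pair."""
    counts = Counter()
    for line in lines:
        match = DENIED.match(line.strip())
        if match:
            counts[(match.group("user"), match.group("host"))] += 1
    return counts


sample = [
    "2025-09-26T01:20:01.900862Z 36 [Note] [MY-010926] [Server] "
    "Access denied for user 'mysql_innodb_cluster_42132052'@'10.42.4.8' "
    "(using password: YES)",
]
denied = count_denied(sample)
for (user, host), n in denied.most_common():
    print(f"{n:5d}  {user}@{host}")
```

Feeding it the full MySQL error log instead of `sample` would show whether the denials come from one pod address or several.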

CR attached here:

ntpdb.yaml

Sep 26 '25 01:09 abh

@abh Thank you for testing. We will check it today.

Sep 26 '25 07:09 hors

Here's the operator log. I had reset a cluster entirely because group replication had failed, and it looks like the operator never got to update the status with the cluster being initialized.

How did you do it? How did you reset it? I need the STR (steps to reproduce).

Sep 26 '25 07:09 hors

I deleted the CRs, the PVCs, and the STS, shut down the operator, then built everything back and restored each of the databases I needed from a mysqldump.

Sep 27 '25 20:09 abh

2025-09-26T01:20:01.900862Z 36 [Note] [MY-010926] [Server] Access denied for user 'mysql_innodb_cluster_42132052'@'10.42.4.8' (using password: YES)

Do you have user creation in your dump file? Please use these two options when you create a dump:

mysqldump --skip-add-drop-user --skip-user-creation -u root -p your_database > new_dump.sql

or '--skip-system-users'.
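Whichever dump options are used, one way to answer the "is there user creation in the dump" question is to scan the file for account-related statements. The following is a rough sketch under stated assumptions: it treats CREATE/ALTER/DROP USER and GRANT statements, plus inserts into the `user` table while the `mysql` schema is selected, as signs of account data. The helper name `find_account_statements` and the sample lines are made up for illustration, and statements wrapped in version comments would not be caught.

```python
import re

# Statements that create or change accounts directly.
ACCOUNT_STATEMENT = re.compile(
    r"^\s*(CREATE\s+USER|ALTER\s+USER|DROP\s+USER|GRANT\s)",
    re.IGNORECASE,
)
# Track which schema the dump is currently writing to.
USE_DB = re.compile(r"^\s*USE\s+`?([^`;\s]+)`?\s*;", re.IGNORECASE)
# Row inserts into a table named `user` (only relevant inside `mysql`).
INSERT_USER = re.compile(r"^\s*INSERT\s+INTO\s+`?user`?\s", re.IGNORECASE)


def find_account_statements(lines):
    """Return (line_number, text) for lines that look like account data."""
    hits = []
    current_db = None
    for number, line in enumerate(lines, start=1):
        use = USE_DB.match(line)
        if use:
            current_db = use.group(1)
            continue
        if ACCOUNT_STATEMENT.match(line):
            hits.append((number, line.strip()))
        elif current_db == "mysql" and INSERT_USER.match(line):
            hits.append((number, line.strip()))
    return hits


# Made-up sample: an application table called `user` must not be flagged.
sample_dump = [
    "USE `appdb`;",
    "INSERT INTO `user` VALUES (1,'alice');",
    "USE `mysql`;",
    "INSERT INTO `user` VALUES ('%','example_user');",
    "CREATE USER 'app'@'%' IDENTIFIED BY 'x';",
]
hits = find_account_statements(sample_dump)
for number, text in hits:
    print(f"line {number}: {text}")
```

For a real dump, pass an open file object (for example `open("new_dump.sql", encoding="utf-8")`) instead of `sample_dump`; an empty result suggests the dump carries no account data in the forms checked here.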

Sep 28 '25 08:09 hors

@abh I am fairly sure that your status error will go away in v1.0.0 thanks to this commit: https://github.com/percona/percona-server-mysql-operator/commit/16ee670039835c54e12eedff81189ba373f7e3ef

Oct 31 '25 15:10 egegunes

@abh I’m happy to inform you that the MySQL Kubernetes Operator v1.0.0 was released today. This is the GA release, and you can start using it for production workloads. I also want to say a big thank you for using our operator and helping us improve it by providing feedback.

Nov 17 '25 19:11 hors
