K8SPXC-1462: Restart PXC pods after monitor user password change

Open s10 opened this issue 1 year ago • 6 comments


Problem: The monitor user may be used not only by PMM sidecar containers but also by custom mysqld-exporter sidecar containers running alongside PXC. Custom sidecars that use the monitor user need a pod restart after a password change, just as PMM sidecars do.

Cause: A monitor user password update triggers a PXC pod restart only when PMM is enabled.

Solution: Restart PXC pods after a monitor user password change regardless of whether PMM is enabled.
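
As a rough illustration of the change in behavior (not the operator's actual code), the restart decision before and after this PR could be sketched as below. The function names `needsRestartBefore` and `needsRestartAfter` are hypothetical and exist only for this example.

```go
package main

import "fmt"

// needsRestartBefore models the behavior described under "Cause":
// a monitor password change restarts PXC pods only when PMM is enabled.
// Hypothetical helper, not taken from the operator source.
func needsRestartBefore(passwordChanged, pmmEnabled bool) bool {
	return passwordChanged && pmmEnabled
}

// needsRestartAfter models the proposed behavior: a monitor password
// change restarts PXC pods whether or not PMM is enabled.
// Hypothetical helper, not taken from the operator source.
func needsRestartAfter(passwordChanged, _ bool) bool {
	return passwordChanged
}

func main() {
	// Compare both behaviors for a changed password, with PMM off and on.
	for _, pmm := range []bool{false, true} {
		fmt.Printf("pmm=%v before=%v after=%v\n",
			pmm, needsRestartBefore(true, pmm), needsRestartAfter(true, pmm))
	}
}
```

The two functions differ only when PMM is disabled, which is exactly the custom-sidecar case described in the problem statement.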

CHECKLIST

Jira

  • [ ] Is the Jira ticket created and referenced properly?
  • [ ] Does the Jira ticket have the proper statuses for documentation (Needs Doc) and QA (Needs QA)?
  • [ ] Does the Jira ticket link to the proper milestone (Fix Version field)?

Tests

  • [ ] Is an E2E test/test case added for the new feature/change?
  • [ ] Are unit tests added where appropriate?
  • [ ] Are OpenShift compare files changed for E2E tests (compare/*-oc.yml)?

Config/Logging/Testability

  • [ ] Are all needed new/changed options added to default YAML files?
  • [ ] Did we add proper logging messages for operator actions?
  • [ ] Did we ensure compatibility with the previous version or cluster upgrade process?
  • [ ] Does the change support oldest and newest supported PXC version?
  • [ ] Does the change support oldest and newest supported Kubernetes version?

s10 avatar Sep 13 '24 12:09 s10

I agree with @hors: I don't think it's a good idea to restart PXC pods unless we know for sure that a restart is needed. @s10, how do you use the monitor password in your custom sidecar: env, envFromSecret, or a volume? Maybe we can add a check for these and decide whether a restart is needed.

egegunes avatar Sep 27 '24 09:09 egegunes

I use envFromSecret:

    sidecars:
    - name: metrics
      image: prom/mysqld-exporter:v0.15.1
      env:
        - name: MYSQLD_EXPORTER_PASSWORD
          valueFrom:
            secretKeyRef:
              name: pxc-dev-1-secrets
              key: monitor
      ports:
        - name: metrics
          containerPort: 9104
      args:
        - "--mysqld.username=monitor"
        - "--mysqld.address=localhost:3306"
        - "--collect.binlog_size"
 ...

s10 avatar Sep 27 '24 10:09 s10

It might be good to check keys for system usernames in env and envFromSecret of sidecar container and decide if it needs a restart. wdyt @hors @spron-in?
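
A minimal sketch of what such a check might look like, under stated assumptions: the structs below are simplified local stand-ins for the Kubernetes core/v1 container types (only the fields needed here), and `sidecarsUseSecretKey` is a hypothetical helper, not the code merged in this PR. It covers `env` entries with a `secretKeyRef` and `envFrom` entries with a `secretRef`.

```go
package main

import "fmt"

// Simplified stand-ins for the Kubernetes core/v1 types of the same names.
// Only the fields used by this sketch are included.
type SecretKeySelector struct {
	Name string // name of the referenced secret
	Key  string // key inside that secret
}

type EnvVarSource struct {
	SecretKeyRef *SecretKeySelector
}

type EnvVar struct {
	Name      string
	ValueFrom *EnvVarSource
}

type SecretEnvSource struct {
	Name string // name of the secret imported as a whole
}

type EnvFromSource struct {
	SecretRef *SecretEnvSource
}

type Container struct {
	Name    string
	Env     []EnvVar
	EnvFrom []EnvFromSource
}

// sidecarsUseSecretKey (hypothetical) reports whether any sidecar container
// reads the given key of the given secret through its environment.
func sidecarsUseSecretKey(sidecars []Container, secretName, key string) bool {
	for _, c := range sidecars {
		for _, e := range c.Env {
			if e.ValueFrom == nil || e.ValueFrom.SecretKeyRef == nil {
				continue // literal value or a non-secret source
			}
			ref := e.ValueFrom.SecretKeyRef
			if ref.Name == secretName && ref.Key == key {
				return true
			}
		}
		for _, f := range c.EnvFrom {
			// envFrom imports every key of the secret, so it includes `key`.
			if f.SecretRef != nil && f.SecretRef.Name == secretName {
				return true
			}
		}
	}
	return false
}

func main() {
	// Mirrors the mysqld-exporter sidecar posted earlier in this thread.
	metrics := Container{
		Name: "metrics",
		Env: []EnvVar{{
			Name: "MYSQLD_EXPORTER_PASSWORD",
			ValueFrom: &EnvVarSource{
				SecretKeyRef: &SecretKeySelector{Name: "pxc-dev-1-secrets", Key: "monitor"},
			},
		}},
	}
	fmt.Println(sidecarsUseSecretKey([]Container{metrics}, "pxc-dev-1-secrets", "monitor")) // prints "true"
}
```

The sketch leaves secret volumes out. Environment variables sourced from a secret are fixed when the container starts, whereas mounted secret volumes are refreshed in place by the kubelet (except for `subPath` mounts), so only the environment-based references clearly require a restart.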

egegunes avatar Sep 27 '24 20:09 egegunes

> It might be good to check keys for system usernames in env and envFromSecret of sidecar container and decide if it needs a restart. wdyt @hors @spron-in?

It is a good idea.

hors avatar Sep 30 '24 08:09 hors

I understand the desire to minimize the number of restarts, but I also want to be sure we are not overcomplicating things here. I would not expect a user to change the monitor user's password very often, or to change it when monitoring in some form is not enabled. So we would be implementing extra checks for a fairly rare operation, and one that actually justifies a pod restart.

Am I looking at it wrong?

spron-in avatar Oct 16 '24 09:10 spron-in

@spron-in I really don't want to restart PXC pods blindly. We can always say users should restart the pods themselves, but if we want to do it automatically, I believe the current changes look good.

egegunes avatar Nov 11 '24 10:11 egegunes

@inelpandzic please review

egegunes avatar Nov 20 '24 18:11 egegunes

| Test name | Status |
|---|---|
| affinity-8-0 | passed |
| auto-tuning-8-0 | passed |
| cross-site-8-0 | passed |
| custom-users-8-0 | passed |
| demand-backup-cloud-8-0 | passed |
| demand-backup-encrypted-with-tls-8-0 | passed |
| demand-backup-8-0 | passed |
| haproxy-5-7 | passed |
| haproxy-8-0 | passed |
| init-deploy-5-7 | passed |
| init-deploy-8-0 | passed |
| limits-8-0 | passed |
| monitoring-2-0-8-0 | passed |
| one-pod-5-7 | passed |
| one-pod-8-0 | passed |
| pitr-8-0 | passed |
| pitr-gap-errors-8-0 | passed |
| proxy-protocol-8-0 | passed |
| proxysql-sidecar-res-limits-8-0 | passed |
| pvc-resize-5-7 | passed |
| pvc-resize-8-0 | passed |
| recreate-8-0 | passed |
| restore-to-encrypted-cluster-8-0 | passed |
| scaling-proxysql-8-0 | passed |
| scaling-8-0 | passed |
| scheduled-backup-5-7 | passed |
| scheduled-backup-8-0 | passed |
| security-context-8-0 | passed |
| smart-update1-8-0 | passed |
| smart-update2-8-0 | passed |
| storage-8-0 | passed |
| tls-issue-cert-manager-ref-8-0 | passed |
| tls-issue-cert-manager-8-0 | passed |
| tls-issue-self-8-0 | passed |
| upgrade-consistency-8-0 | passed |
| upgrade-haproxy-5-7 | passed |
| upgrade-haproxy-8-0 | passed |
| upgrade-proxysql-5-7 | passed |
| upgrade-proxysql-8-0 | passed |
| users-5-7 | passed |
| users-8-0 | passed |
| validation-hook-8-0 | passed |

We run 42 out of 42

commit: https://github.com/percona/percona-xtradb-cluster-operator/pull/1816/commits/fc65a9be98974c6d1b063fb32ca1c2460db10682
image: perconalab/percona-xtradb-cluster-operator:PR-1816-fc65a9be

JNKPercona avatar Dec 05 '24 15:12 JNKPercona

@s10 Thank you for your contribution.

hors avatar Dec 05 '24 18:12 hors

Hey @s10, JFYI we created two improvement tickets related to these changes.

  1. https://perconadev.atlassian.net/browse/K8SPXC-1523
  2. https://perconadev.atlassian.net/browse/K8SPXC-1524

egegunes avatar Dec 12 '24 15:12 egegunes