
[BUG] Restoring PG clusters with kubectl secondary instance always reports password auth failed

Open wutz opened this issue 1 year ago • 14 comments

Describe the bug

2024-04-01 09:20:50,501 INFO: Lock owner: pg3-postgresql-1; I am pg3-postgresql-0
2024-04-01 09:20:50,506 INFO: Local timeline=11 lsn=0/18112A30
2024-04-01 09:20:50,515 ERROR: Exception when working with leader
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/rewind.py", line 69, in check_leader_is_not_in_recovery
    with get_connection_cursor(connect_timeout=3, options='-c statement_timeout=2000', **conn_kwargs) as cur:
  File "/usr/lib/python3.10/contextlib.py", line 135, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.10/dist-packages/patroni/postgresql/connection.py", line 43, in get_connection_cursor
    conn = psycopg.connect(**kwargs)
  File "/usr/local/lib/python3.10/dist-packages/patroni/psycopg.py", line 42, in connect
    ret = _connect(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/psycopg2/__init__.py", line 122, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: connection to server at "10.64.0.96", port 5432 failed: FATAL:  password authentication failed for user "postgres"
connection to server at "10.64.0.96", port 5432 failed: FATAL:  password authentication failed for user "postgres"

2024-04-01 09:20:50,519 INFO: no action. I am (pg3-postgresql-0), a secondary, and following a leader (pg3-postgresql-1)

To Reproduce

Steps to reproduce the behavior:

Follow the backup restore guide, but restore using kubectl instead of kbcli cluster restore.

Expected behavior

The correct connection password should be set when restoring the cluster with kubectl.

Desktop (please complete the following information):

  • OS: Ubuntu 22.04
  • Kubernetes: v1.28.7+k3s1
  • KubeBlocks: 0.8.2
  • kbcli: 0.8.2

Additional context

I manually changed the password in the secret <cluster-name>-conn-credential and it worked fine.

wutz avatar Apr 01 '24 09:04 wutz
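For reference, the workaround above amounts to replacing the password key in the connection secret. Kubernetes stores Secret `data` values base64-encoded, so a hand-patched value has to be encoded first. A minimal shell sketch, using the secret name and password that appear later in this thread (the kubectl command is illustrative only, not run here):

```shell
# The decrypted password recorded on the Backup object (value from this thread).
PASSWORD='vNK8Db9/6RagdL2Rppw5v5Av/3zU/rLAlUgtCsJ0OriITmxr'

# Kubernetes Secret `data` values must be base64-encoded.
ENCODED=$(printf '%s' "$PASSWORD" | base64 | tr -d '\n')
echo "$ENCODED"

# The patch itself would then look something like (illustrative, not run here):
#   kubectl patch secret pg3-conn-credential -n default \
#     --type merge -p "{\"data\":{\"password\":\"$ENCODED\"}}"
```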

I think kbcli cluster restore has some extra logic, beyond what the kubectl method does, to set the connection password correctly.

wutz avatar Apr 01 '24 09:04 wutz

@wutz when you use kubectl to restore a cluster, you must set the correct connectionPassword in the new cluster's annotation, taken from the specified backup's annotation.

wangyelei avatar Apr 01 '24 10:04 wangyelei

I get the connection password from the backup annotation dataprotection.kubeblocks.io/connection-password

wutz avatar Apr 01 '24 10:04 wutz
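That annotation value is what has to land in the new cluster's kubeblocks.io/restore-from-backup annotation (visible in the Cluster YAML later in this thread). A minimal shell sketch composing it, with the backup name, namespace, and component name taken from this issue; reading the value off the Backup with kubectl is shown only as a comment:

```shell
# Normally you would read the password off the Backup, e.g.:
#   kubectl get backup mybackup -n default -o jsonpath=\
#     '{.metadata.annotations.dataprotection\.kubeblocks\.io/connection-password}'
PASSWORD='vNK8Db9/6RagdL2Rppw5v5Av/3zU/rLAlUgtCsJ0OriITmxr'

# Compose the restore annotation value, keyed by component name.
ANNOTATION=$(printf '{"%s":{"connectionPassword":"%s","name":"%s","namespace":"%s","volumeRestorePolicy":"Parallel"}}' \
  postgresql "$PASSWORD" mybackup default)
echo "$ANNOTATION"

# Then put it on the new Cluster (illustrative only):
#   kubectl annotate cluster pg3 "kubeblocks.io/restore-from-backup=$ANNOTATION"
```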

> I get the connection password from the backup annotation dataprotection.kubeblocks.io/connection-password

Thanks, I will check it.

wangyelei avatar Apr 01 '24 10:04 wangyelei

Hi, I tried it but the problem didn't occur. Can you provide the backup and cluster YAML here?

wangyelei avatar Apr 01 '24 10:04 wangyelei

Backup

apiVersion: dataprotection.kubeblocks.io/v1alpha1
kind: Backup
metadata:
  annotations:
    dataprotection.kubeblocks.io/connection-password: vNK8Db9/6RagdL2Rppw5v5Av/3zU/rLAlUgtCsJ0OriITmxr
    dataprotection.kubeblocks.io/target-pod-name: pg-postgresql-0
    kubeblocks.io/cluster-snapshot: '{"metadata":{"name":"pg","namespace":"default","creationTimestamp":null},"spec":{"clusterDefinitionRef":"postgresql","clusterVersionRef":"postgresql-14.8.0","terminationPolicy":"Halt","componentSpecs":[{"name":"postgresql","componentDefRef":"postgresql","enabledLogs":["running"],"replicas":3,"resources":{"limits":{"cpu":"8","memory":"16Gi"},"requests":{"cpu":"4","memory":"8Gi"}},"volumeClaimTemplates":[{"name":"data","spec":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"100Gi"}},"storageClassName":"local-path"}}],"switchPolicy":{"type":"Noop"},"serviceAccountName":"kb-pg","rsmTransformPolicy":"ToSts"}],"affinity":{"podAntiAffinity":"Preferred","topologyKeys":["kubernetes.io/hostname"],"tenancy":"SharedNode"},"resources":{"cpu":"0","memory":"0"},"storage":{"size":"0"},"monitor":{}},"status":{}}'
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"dataprotection.kubeblocks.io/v1alpha1","kind":"Backup","metadata":{"annotations":{"dataprotection.kubeblocks.io/connection-password":"d2tpgxck"},"name":"mybackup","namespace":"default"},"spec":{"backupMethod":"pg-basebackup","backupPolicyName":"pg-postgresql-backup-policy"}}
  creationTimestamp: "2024-04-01T09:06:03Z"
  finalizers:
  - dataprotection.kubeblocks.io/finalizer
  generation: 1
  labels:
    app.kubernetes.io/instance: pg
    app.kubernetes.io/managed-by: kubeblocks-dataprotection
    apps.kubeblocks.io/component-name: postgresql
    dataprotection.kubeblocks.io/backup-policy: pg-postgresql-backup-policy
    dataprotection.kubeblocks.io/backup-repo-name: default
    dataprotection.kubeblocks.io/backup-type: Full
    dataprotection.kubeblocks.io/cluster-uid: 49632715-4a86-434c-ac5c-9f29e22ff893
  name: mybackup
  namespace: default
  resourceVersion: "13995185"
  uid: 9a591b9b-dcf8-412b-8c9d-b60924820753
spec:
  backupMethod: pg-basebackup
  backupPolicyName: pg-postgresql-backup-policy
  deletionPolicy: Delete
status:
  actions:
  - actionType: Job
    completionTimestamp: "2024-04-01T09:06:13Z"
    name: dp-backup-0
    objectRef:
      apiVersion: batch/v1
      kind: Job
      name: dp-backup-0-mybackup-9a591b9b
      namespace: default
      resourceVersion: "13995182"
      uid: 6fb51f79-6851-490f-9c8c-290e676da81b
    phase: Completed
    startTimestamp: "2024-04-01T09:06:03Z"
  backupMethod:
    actionSetName: postgres-basebackup
    env:
    - name: IMAGE_TAG
      value: 14.8.0-pgvector-v0.5.0
    name: pg-basebackup
    snapshotVolumes: false
    targetVolumes:
      volumeMounts:
      - mountPath: /home/postgres/pgdata
        name: data
  backupRepoName: default
  completionTimestamp: "2024-04-01T09:06:13Z"
  duration: 11s
  formatVersion: 0.1.0
  path: /default/pg-49632715-4a86-434c-ac5c-9f29e22ff893/postgresql/mybackup
  persistentVolumeClaimName: pvc-default-5xrm94
  phase: Completed
  startTimestamp: "2024-04-01T09:06:03Z"
  target:
    connectionCredential:
      passwordKey: password
      secretName: pg-conn-credential
      usernameKey: username
    podSelector:
      matchLabels:
        app.kubernetes.io/instance: pg
        app.kubernetes.io/managed-by: kubeblocks
        apps.kubeblocks.io/component-name: postgresql
        kubeblocks.io/role: secondary
      strategy: Any
    serviceAccountName: kb-pg
  timeRange:
    end: "2024-04-01T09:06:05Z"
    start: "2024-04-01T00:00:00Z"
  totalSize: "4904748"

Cluster

apiVersion: apps.kubeblocks.io/v1alpha1
kind: Cluster
metadata:
  name: pg3
  labels:
    app.kubernetes.io/version: "14.8.0"
    app.kubernetes.io/instance: pg3
  annotations:
    kubeblocks.io/restore-from-backup: '{"postgresql":{"connectionPassword":"vNK8Db9/6RagdL2Rppw5v5Av/3zU/rLAlUgtCsJ0OriITmxr","name":"mybackup","namespace":"default","volumeRestorePolicy":"Parallel"}}'
spec:
  clusterVersionRef: postgresql-14.8.0
  terminationPolicy: Halt
  affinity:
    podAntiAffinity: Preferred
    topologyKeys:
      - kubernetes.io/hostname
    tenancy: SharedNode
  clusterDefinitionRef: postgresql
  componentSpecs:
    - name: postgresql
      componentDefRef: postgresql
      monitor: false
      replicas: 3
      enabledLogs:
        - running
      serviceAccountName: kb-pg
      switchPolicy:
        type: Noop
      resources:
        limits:
          cpu: "8"
          memory: "16Gi"
        requests:
          cpu: "4"
          memory: "8Gi"
      volumeClaimTemplates:
        - name: data # ref clusterDefinition components.containers.volumeMounts.name
          spec:
            storageClassName: local-path
            accessModes:
              - ReadWriteOnce
            resources:
              requests:
                storage: 100Gi
      services:

wutz avatar Apr 01 '24 10:04 wutz

Have you modified the encryptionKey after the backup?

wangyelei avatar Apr 01 '24 13:04 wangyelei

At first, I didn't set up an encryptionKey for backup and restore, and only restoring through kbcli succeeded. Later, I rechecked the tutorial and set up the encryptionKey.

wutz avatar Apr 01 '24 13:04 wutz

The encryptionKey is global; a backup created before updating the key may yield incorrectly decrypted passwords because the keys are inconsistent.

wangyelei avatar Apr 01 '24 13:04 wangyelei

Also, it is better to set the encryptionKey right after installing KubeBlocks.

wangyelei avatar Apr 01 '24 13:04 wangyelei
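In practice that means pinning the key once at install time, so every backup is encrypted and decrypted with the same key. A values fragment for the KubeBlocks Helm chart might look like the following; the value path dataProtection.encryptionKey is an assumption here, so verify it against your chart version:

```yaml
# values.yaml fragment for the KubeBlocks Helm chart.
# Assumption: the value path is dataProtection.encryptionKey; check your chart.
dataProtection:
  # A fixed key set at install time; backups taken before a later key change
  # cannot have their connection passwords decrypted with the new key.
  encryptionKey: "your-fixed-key"
```

Passing the same value with `--set dataProtection.encryptionKey=...` at install time would have the same effect.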

After updating the encryptionKey, I re-created the backup and used kubectl to restore; it still reports password authentication failure.

wutz avatar Apr 02 '24 02:04 wutz

I checked that the password in the instance is the password of the original cluster.

However, pg3-conn-credential contains the newly generated password.

wutz avatar Apr 02 '24 02:04 wutz

One strange thing about this: connecting to the original cluster using psql -U postgres -W (executed from inside any pod) authenticates successfully with any password entered.

Currently, because the password is injected into the pod, exec'ing into the pod is equivalent to having access to the host, making it easy to obtain the password. System administrators need to strictly control exec permissions.

wangyelei avatar Apr 02 '24 02:04 wangyelei

This issue has been marked as stale because it has been open for 30 days with no activity

github-actions[bot] avatar May 06 '24 00:05 github-actions[bot]