Replicas not able to connect to leader
Overview
I started an upgrade from PGO version 5.0.5 to 5.1. It is still in progress (at least the upgrade pod exists), but the replica pods are not able to communicate with the leader pod. Patroni is throwing this exception:
```
psycopg2.OperationalError: connection to server at "hippo-instance1-dvpn-0.hippo-pods" (10.244.1.195), port 5432 failed: FATAL: certificate authentication failed for user "_crunchyrepl"
```

(The full traceback is reproduced in the Logs section below.)
I checked the certificates themselves and they look good; they are generated by cert-manager using a local CA. I tried different common names and DNS names, so maybe that is where the issue lies.
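For anyone checking the same thing: the common name that PostgreSQL sees during certificate authentication can be read straight out of the cert-manager secret. A minimal sketch, assuming the secret name from the `customReplicationTLSSecret` shown later in this report:

```shell
# Print the subject (including CN) of the replication client certificate.
# Secret name taken from customReplicationTLSSecret in the cluster spec below.
kubectl -n pgo get secret hippo-replication-tls \
  -o jsonpath='{.data.tls\.crt}' | base64 -d \
  | openssl x509 -noout -subject
```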
Environment
Please provide the following details:
- Platform: Kubernetes
- Platform Version: 1.22.8
- PGO Image Tag: ubi8-5.1.0-0
- Postgres Version: 14
- Storage: rook-ceph, block storage
Steps to Reproduce
REPRO
Provide steps to get to the error condition:
- Start the upgrade from 5.0.5 to 5.1.
- Change the images in the PostgresCluster CRD to the ones from version 5.1 (a sketch of this step follows the list).
- Observe that the replicas cannot communicate with the leader.
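For reference, step 2 amounts to pointing the spec at the 5.1-era images. A minimal sketch, assuming the image tags from the cluster spec shown later in this report:

```shell
# Merge-patch the Postgres and pgBackRest image references on the cluster.
# Tags below are the ubi8/5.1 images from the spec in this report; adjust as needed.
kubectl -n pgo patch postgrescluster hippo --type merge -p '
{
  "spec": {
    "image": "registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.2-1",
    "backups": {
      "pgbackrest": {
        "image": "registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.38-0"
      }
    }
  }
}'
```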
EXPECTED
- Replicas are able to communicate with the leader; the cluster is healthy.
ACTUAL
- Replicas cannot communicate with the leader; the cluster is unhealthy and only the leader works.
Logs
Error from the database container log:
```
2022-04-28 13:44:26,426 ERROR: Exception when working with leader
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/patroni/postgresql/rewind.py", line 60, in check_leader_is_not_in_recovery
    with get_connection_cursor(connect_timeout=3, options='-c statement_timeout=2000', **conn_kwargs) as cur:
  File "/usr/lib64/python3.6/contextlib.py", line 81, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.6/site-packages/patroni/postgresql/connection.py", line 44, in get_connection_cursor
    conn = psycopg.connect(**kwargs)
  File "/usr/lib64/python3.6/site-packages/psycopg2/__init__.py", line 127, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: connection to server at "hippo-instance1-dvpn-0.hippo-pods" (10.244.1.195), port 5432 failed: FATAL: certificate authentication failed for user "_crunchyrepl"
```
Pods from cluster namespace:
```
NAME                                            READY   STATUS      RESTARTS   AGE
hippo-instance1-7sr7-0                          3/4     Running     0          35m
hippo-instance1-bx96-0                          3/4     Running     0          35m
hippo-instance1-dvpn-0                          4/4     Running     0          2d
hippo-pgbackrest-repo1-full-27517150--1-4xrx9   0/1     Completed   0          34h
hippo-pgbackrest-repo1-full-27518590--1-5fmkz   0/1     Completed   0          10h
hippo-pgbackrest-repo2-full-27517140--1-5rhzh   0/1     Error       0          34h
hippo-pgbackrest-repo2-full-27517140--1-k9phj   0/1     Error       0          34h
hippo-pgbackrest-repo2-full-27517140--1-pkdjg   0/1     Completed   0          34h
hippo-pgbackrest-repo2-full-27518580--1-2hv25   0/1     Error       0          10h
hippo-pgbackrest-repo2-full-27518580--1-c6qff   0/1     Completed   0          10h
hippo-pgbackrest-repo2-full-27518580--1-jmswl   0/1     Error       0          10h
hippo-pgbackrest-repo2-full-27518580--1-z7vgf   0/1     Error       0          10h
hippo-pgbackrest-repo2-incr-27519060--1-fhrm5   0/1     Completed   0          174m
hippo-pgbackrest-repo2-incr-27519120--1-t9jn5   0/1     Completed   0          114m
hippo-pgbackrest-repo2-incr-27519180--1-blsnv   0/1     Completed   0          54m
hippo-repo-host-0                               2/2     Running     0          2d1h
pgo-747d898c67-c2hcr                            1/1     Running     0          2d1h
pgo-upgrade-68b4797d7f-k8ppq                    1/1     Running     0          2d1h
```
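To see the failure from Patroni's perspective (member roles, replication state), one can exec into the leader's database container from the listing above; replicas that cannot connect typically show an empty or failed state:

```shell
# Ask Patroni for its member list and replication state.
kubectl -n pgo exec hippo-instance1-dvpn-0 -c database -- patronictl list
```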
Describe output for the leader pod (truncated):
```
Name: hippo-instance1-dvpn-0
Namespace: pgo
Priority: 0
Node:
NSS_WRAPPER_DIR="/tmp/nss_wrapper/${NSS_WRAPPER_SUBDIR}"
NSS_WRAPPER_PASSWD="${NSS_WRAPPER_DIR}/passwd"
NSS_WRAPPER_GROUP="${NSS_WRAPPER_DIR}/group"
# create the nss_wrapper directory
mkdir -p "${NSS_WRAPPER_DIR}"
# grab the current user ID and group ID
USER_ID=$(id -u)
export USER_ID
GROUP_ID=$(id -g)
export GROUP_ID
# get copies of the passwd and group files
[[ -f "${NSS_WRAPPER_PASSWD}" ]] || cp "/etc/passwd" "${NSS_WRAPPER_PASSWD}"
[[ -f "${NSS_WRAPPER_GROUP}" ]] || cp "/etc/group" "${NSS_WRAPPER_GROUP}"
# if the username is missing from the passwd file, then add it
if [[ ! $(cat "${NSS_WRAPPER_PASSWD}") =~ ${CRUNCHY_NSS_USERNAME}:x:${USER_ID} ]]; then
echo "nss_wrapper: adding user"
passwd_tmp="${NSS_WRAPPER_DIR}/passwd_tmp"
cp "${NSS_WRAPPER_PASSWD}" "${passwd_tmp}"
sed -i "/${CRUNCHY_NSS_USERNAME}:x:/d" "${passwd_tmp}"
# needed for OCP 4.x because crio updates /etc/passwd with an entry for USER_ID
sed -i "/${USER_ID}:x:/d" "${passwd_tmp}"
printf '${CRUNCHY_NSS_USERNAME}:x:${USER_ID}:${GROUP_ID}:${CRUNCHY_NSS_USER_DESC}:${HOME}:/bin/bash\n' >> "${passwd_tmp}"
envsubst < "${passwd_tmp}" > "${NSS_WRAPPER_PASSWD}"
rm "${passwd_tmp}"
else
echo "nss_wrapper: user exists"
fi
# if the username (which will be the same as the group name) is missing from group file, then add it
if [[ ! $(cat "${NSS_WRAPPER_GROUP}") =~ ${CRUNCHY_NSS_USERNAME}:x:${USER_ID} ]]; then
echo "nss_wrapper: adding group"
group_tmp="${NSS_WRAPPER_DIR}/group_tmp"
cp "${NSS_WRAPPER_GROUP}" "${group_tmp}"
sed -i "/${CRUNCHY_NSS_USERNAME}:x:/d" "${group_tmp}"
printf '${CRUNCHY_NSS_USERNAME}:x:${USER_ID}:${CRUNCHY_NSS_USERNAME}\n' >> "${group_tmp}"
envsubst < "${group_tmp}" > "${NSS_WRAPPER_GROUP}"
rm "${group_tmp}"
else
echo "nss_wrapper: group exists"
fi
# export the nss_wrapper env vars
# define nss_wrapper directory and passwd & group files that will be utilized by nss_wrapper
NSS_WRAPPER_DIR="/tmp/nss_wrapper/${NSS_WRAPPER_SUBDIR}"
NSS_WRAPPER_PASSWD="${NSS_WRAPPER_DIR}/passwd"
NSS_WRAPPER_GROUP="${NSS_WRAPPER_DIR}/group"
export LD_PRELOAD=/usr/lib64/libnss_wrapper.so
export NSS_WRAPPER_PASSWD="${NSS_WRAPPER_PASSWD}"
export NSS_WRAPPER_GROUP="${NSS_WRAPPER_GROUP}"
echo "nss_wrapper: environment configured"
State: Terminated
Reason: Completed
Exit Code: 0
Started: Tue, 26 Apr 2022 13:07:39 +0000
Finished: Tue, 26 Apr 2022 13:07:39 +0000
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/tmp from tmp (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-j5vd9 (ro)
Containers:
  database:
    Container ID:   docker://0a93bf1dd4ba135aba8161ac8d5621df741ce8ea279f1b022c6971e2d31f4cce
    Image:          registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.2-1
    Image ID:       docker-pullable://registry.developers.crunchydata.com/crunchydata/crunchy-postgres@sha256:9d71b968a08e6b189051d4e8d64e2ea2c118ac60e4b7b301478bb0b4942de7af
    Port:           5432/TCP
    Host Port:      0/TCP
    Command:
      patroni
      /etc/patroni
    State:          Running
      Started:      Tue, 26 Apr 2022 13:07:40 +0000
    Ready:          True
    Restart Count:  0
    Liveness:       http-get https://:8008/liveness delay=3s timeout=5s period=10s #success=1 #failure=3
    Readiness:      http-get https://:8008/readiness delay=3s timeout=5s period=10s #success=1 #failure=3
    Environment:
      PGDATA:                              /pgdata/pg14
      PGHOST:                              /tmp/postgres
      PGPORT:                              5432
      KRB5_CONFIG:                         /etc/postgres/krb5.conf
      KRB5RCACHEDIR:                       /tmp
      PATRONI_NAME:                        hippo-instance1-dvpn-0 (v1:metadata.name)
      PATRONI_KUBERNETES_POD_IP:           (v1:status.podIP)
      PATRONI_KUBERNETES_PORTS:            - name: postgres
                                             port: 5432
                                             protocol: TCP
      PATRONI_POSTGRESQL_CONNECT_ADDRESS:  $(PATRONI_NAME).hippo-pods:5432
      PATRONI_POSTGRESQL_LISTEN:           *:5432
      PATRONI_POSTGRESQL_CONFIG_DIR:       /pgdata/pg14
      PATRONI_POSTGRESQL_DATA_DIR:         /pgdata/pg14
      PATRONI_RESTAPI_CONNECT_ADDRESS:     $(PATRONI_NAME).hippo-pods:8008
      PATRONI_RESTAPI_LISTEN:              *:8008
      PATRONICTL_CONFIG_FILE:              /etc/patroni
      LD_PRELOAD:                          /usr/lib64/libnss_wrapper.so
      NSS_WRAPPER_PASSWD:                  /tmp/nss_wrapper/postgres/passwd
      NSS_WRAPPER_GROUP:                   /tmp/nss_wrapper/postgres/group
    Mounts:
      /dev/shm from dshm (rw)
      /etc/database-containerinfo from database-containerinfo (ro)
      /etc/patroni from patroni-config (ro)
      /etc/pgbackrest/conf.d from pgbackrest-config (ro)
      /pgconf/tls from cert-volume (ro)
      /pgdata from postgres-data (rw)
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-j5vd9 (ro)
  replication-cert-copy:
    Container ID:  docker://5c62bbd1f3f5d4deee388d79db3d56c91aab772c3e237ba3a9b278152e84d81d
    Image:         registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.2-1
    Image ID:      docker-pullable://registry.developers.crunchydata.com/crunchydata/crunchy-postgres@sha256:9d71b968a08e6b189051d4e8d64e2ea2c118ac60e4b7b301478bb0b4942de7af
    Port:
```
Describe output for one of the failing replicas (truncated):
```
Name: hippo-instance1-7sr7-0
Namespace: pgo
Priority: 0
Node: combartekbroniewski-slot4/192.168.1.43
Start Time: Thu, 28 Apr 2022 13:19:02 +0000
Labels: controller-revision-hash=hippo-instance1-7sr7-868ccc8d54
postgres-operator.crunchydata.com/cluster=hippo
postgres-operator.crunchydata.com/data=postgres
postgres-operator.crunchydata.com/instance=hippo-instance1-7sr7
postgres-operator.crunchydata.com/instance-set=instance1
postgres-operator.crunchydata.com/patroni=hippo-ha
statefulset.kubernetes.io/pod-name=hippo-instance1-7sr7-0
Annotations: status:
{"conn_url":"postgres://hippo-instance1-7sr7-0.hippo-pods:5432/postgres","api_url":"https://hippo-instance1-7sr7-0.hippo-pods:8008/patroni...
Status: Running
IP: 10.244.2.67
IPs:
IP: 10.244.2.67
Controlled By: StatefulSet/hippo-instance1-7sr7
Init Containers:
postgres-startup:
Container ID: containerd://e9e47f98b12ff9e5aefcf96224fb84895a76feb000843d73daf31df6e2b9a0ac
Image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.2-1
Image ID: registry.developers.crunchydata.com/crunchydata/crunchy-postgres@sha256:9d71b968a08e6b189051d4e8d64e2ea2c118ac60e4b7b301478bb0b4942de7af
Port:
NSS_WRAPPER_DIR="/tmp/nss_wrapper/${NSS_WRAPPER_SUBDIR}"
NSS_WRAPPER_PASSWD="${NSS_WRAPPER_DIR}/passwd"
NSS_WRAPPER_GROUP="${NSS_WRAPPER_DIR}/group"
# create the nss_wrapper directory
mkdir -p "${NSS_WRAPPER_DIR}"
# grab the current user ID and group ID
USER_ID=$(id -u)
export USER_ID
GROUP_ID=$(id -g)
export GROUP_ID
# get copies of the passwd and group files
[[ -f "${NSS_WRAPPER_PASSWD}" ]] || cp "/etc/passwd" "${NSS_WRAPPER_PASSWD}"
[[ -f "${NSS_WRAPPER_GROUP}" ]] || cp "/etc/group" "${NSS_WRAPPER_GROUP}"
# if the username is missing from the passwd file, then add it
if [[ ! $(cat "${NSS_WRAPPER_PASSWD}") =~ ${CRUNCHY_NSS_USERNAME}:x:${USER_ID} ]]; then
echo "nss_wrapper: adding user"
passwd_tmp="${NSS_WRAPPER_DIR}/passwd_tmp"
cp "${NSS_WRAPPER_PASSWD}" "${passwd_tmp}"
sed -i "/${CRUNCHY_NSS_USERNAME}:x:/d" "${passwd_tmp}"
# needed for OCP 4.x because crio updates /etc/passwd with an entry for USER_ID
sed -i "/${USER_ID}:x:/d" "${passwd_tmp}"
printf '${CRUNCHY_NSS_USERNAME}:x:${USER_ID}:${GROUP_ID}:${CRUNCHY_NSS_USER_DESC}:${HOME}:/bin/bash\n' >> "${passwd_tmp}"
envsubst < "${passwd_tmp}" > "${NSS_WRAPPER_PASSWD}"
rm "${passwd_tmp}"
else
echo "nss_wrapper: user exists"
fi
# if the username (which will be the same as the group name) is missing from group file, then add it
if [[ ! $(cat "${NSS_WRAPPER_GROUP}") =~ ${CRUNCHY_NSS_USERNAME}:x:${USER_ID} ]]; then
echo "nss_wrapper: adding group"
group_tmp="${NSS_WRAPPER_DIR}/group_tmp"
cp "${NSS_WRAPPER_GROUP}" "${group_tmp}"
sed -i "/${CRUNCHY_NSS_USERNAME}:x:/d" "${group_tmp}"
printf '${CRUNCHY_NSS_USERNAME}:x:${USER_ID}:${CRUNCHY_NSS_USERNAME}\n' >> "${group_tmp}"
envsubst < "${group_tmp}" > "${NSS_WRAPPER_GROUP}"
rm "${group_tmp}"
else
echo "nss_wrapper: group exists"
fi
# export the nss_wrapper env vars
# define nss_wrapper directory and passwd & group files that will be utilized by nss_wrapper
NSS_WRAPPER_DIR="/tmp/nss_wrapper/${NSS_WRAPPER_SUBDIR}"
NSS_WRAPPER_PASSWD="${NSS_WRAPPER_DIR}/passwd"
NSS_WRAPPER_GROUP="${NSS_WRAPPER_DIR}/group"
export LD_PRELOAD=/usr/lib64/libnss_wrapper.so
export NSS_WRAPPER_PASSWD="${NSS_WRAPPER_PASSWD}"
export NSS_WRAPPER_GROUP="${NSS_WRAPPER_GROUP}"
echo "nss_wrapper: environment configured"
State: Terminated
Reason: Completed
Exit Code: 0
Started: Thu, 28 Apr 2022 13:19:14 +0000
Finished: Thu, 28 Apr 2022 13:19:14 +0000
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/tmp from tmp (rw)
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-mk5w5 (ro)
Containers:
  database:
    Container ID:   containerd://59773b768a733beff4f48bde8e48cbed9fc76389d655362c5a0a11d2c1f054cc
    Image:          registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.2-1
    Image ID:       registry.developers.crunchydata.com/crunchydata/crunchy-postgres@sha256:9d71b968a08e6b189051d4e8d64e2ea2c118ac60e4b7b301478bb0b4942de7af
    Port:           5432/TCP
    Host Port:      0/TCP
    Command:
      patroni
      /etc/patroni
    State:          Running
      Started:      Thu, 28 Apr 2022 13:19:15 +0000
    Ready:          False
    Restart Count:  0
    Liveness:       http-get https://:8008/liveness delay=3s timeout=5s period=10s #success=1 #failure=3
    Readiness:      http-get https://:8008/readiness delay=3s timeout=5s period=10s #success=1 #failure=3
    Environment:
      PGDATA:                              /pgdata/pg14
      PGHOST:                              /tmp/postgres
      PGPORT:                              5432
      KRB5_CONFIG:                         /etc/postgres/krb5.conf
      KRB5RCACHEDIR:                       /tmp
      PATRONI_NAME:                        hippo-instance1-7sr7-0 (v1:metadata.name)
      PATRONI_KUBERNETES_POD_IP:           (v1:status.podIP)
      PATRONI_KUBERNETES_PORTS:            - name: postgres
                                             port: 5432
                                             protocol: TCP
      PATRONI_POSTGRESQL_CONNECT_ADDRESS:  $(PATRONI_NAME).hippo-pods:5432
      PATRONI_POSTGRESQL_LISTEN:           *:5432
      PATRONI_POSTGRESQL_CONFIG_DIR:       /pgdata/pg14
      PATRONI_POSTGRESQL_DATA_DIR:         /pgdata/pg14
      PATRONI_RESTAPI_CONNECT_ADDRESS:     $(PATRONI_NAME).hippo-pods:8008
      PATRONI_RESTAPI_LISTEN:              *:8008
      PATRONICTL_CONFIG_FILE:              /etc/patroni
      LD_PRELOAD:                          /usr/lib64/libnss_wrapper.so
      NSS_WRAPPER_PASSWD:                  /tmp/nss_wrapper/postgres/passwd
      NSS_WRAPPER_GROUP:                   /tmp/nss_wrapper/postgres/group
    Mounts:
      /dev/shm from dshm (rw)
      /etc/database-containerinfo from database-containerinfo (ro)
      /etc/patroni from patroni-config (ro)
      /etc/pgbackrest/conf.d from pgbackrest-config (ro)
      /pgconf/tls from cert-volume (ro)
      /pgdata from postgres-data (rw)
      /tmp from tmp (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-mk5w5 (ro)
  replication-cert-copy:
    Container ID:  containerd://5c11252359b2c73c79ac78eaf9e308293881043c780127d2ccc6c753edfb955c
    Image:         registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.2-1
    Image ID:      registry.developers.crunchydata.com/crunchydata/crunchy-postgres@sha256:9d71b968a08e6b189051d4e8d64e2ea2c118ac60e4b7b301478bb0b4942de7af
    Port:
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  39m                    default-scheduler  Successfully assigned pgo/hippo-instance1-7sr7-0 to combartekbroniewski-slot4
  Normal   Pulled     38m                    kubelet            Container image "registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.2-1" already present on machine
  Normal   Created    38m                    kubelet            Created container postgres-startup
  Normal   Started    38m                    kubelet            Started container postgres-startup
  Normal   Started    38m                    kubelet            Started container nss-wrapper-init
  Normal   Created    38m                    kubelet            Created container nss-wrapper-init
  Normal   Pulled     38m                    kubelet            Container image "registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.2-1" already present on machine
  Normal   Created    38m                    kubelet            Created container replication-cert-copy
  Normal   Pulled     38m                    kubelet            Container image "registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.2-1" already present on machine
  Normal   Created    38m                    kubelet            Created container database
  Normal   Started    38m                    kubelet            Started container database
  Normal   Pulled     38m                    kubelet            Container image "registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.2-1" already present on machine
  Normal   Started    38m                    kubelet            Started container replication-cert-copy
  Normal   Pulled     38m                    kubelet            Container image "registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.38-0" already present on machine
  Normal   Created    38m                    kubelet            Created container pgbackrest
  Normal   Started    38m                    kubelet            Started container pgbackrest
  Normal   Pulled     38m                    kubelet            Container image "registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.38-0" already present on machine
  Normal   Created    38m                    kubelet            Created container pgbackrest-config
  Normal   Started    38m                    kubelet            Started container pgbackrest-config
  Warning  Unhealthy  3m46s (x241 over 38m)  kubelet            Readiness probe failed: HTTP probe failed with statuscode: 503
```
Additional Information
Please provide any additional information that may be helpful.
Cluster CRD:
```yaml
apiVersion: v1
items:
- apiVersion: postgres-operator.crunchydata.com/v1beta1
  kind: PostgresCluster
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"postgres-operator.crunchydata.com/v1beta1","kind":"PostgresCluster","metadata":{"annotations":{},"name":"hippo","namespace":"pgo"},"spec":{"backups":{"pgbackrest":{"image":"registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.36-1","repos":[{"name":"repo1","volume":{"volumeClaimSpec":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"1Gi"}}}}}]}},"customTLSSecret":{"name":"hippo-tls"},"image":"registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-14.2-0","instances":[{"dataVolumeClaimSpec":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"1Gi"}}},"name":"instance1","replicas":2}],"postgresVersion":14}}
    creationTimestamp: "2022-03-22T01:26:11Z"
    finalizers:
    - postgres-operator.crunchydata.com/finalizer
    generation: 44
    name: hippo
    namespace: pgo
    resourceVersion: "103210973"
    uid: 4ff648ca-9f81-4f64-aeca-a30fd586f2b0
  spec:
    backups:
      pgbackrest:
        configuration:
        - secret:
            name: pgo-s3-creds
        global:
          repo1-retention-full: "7"
          repo1-retention-full-type: time
          repo2-host-cert-file: /run/secrets/kubernetes.io/serviceaccount/ca.crt
          repo2-path: /pgo/hippo/repo2
          repo2-retention-full: "14"
          repo2-retention-full-type: time
          repo2-s3-uri-style: path
          repo2-storage-verify-tls: "y"
        image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi8-2.38-0
        manual:
          options:
          - --type=full
          repoName: repo2
        repos:
        - name: repo1
          schedules:
            full: 10 3 * * *
          volume:
            volumeClaimSpec:
              accessModes:
              - ReadWriteOnce
              resources:
                requests:
                  storage: 1Gi
        - name: repo2
          s3:
            bucket: translison-db-backup
            endpoint: <wiped out>
            region: pl-rack-1
          schedules:
            full: 0 3 * * *
            incremental: 0 */1 * * *
    customReplicationTLSSecret:
      name: hippo-replication-tls
      optional: false
    customTLSSecret:
      name: hippo-tls
      optional: false
    image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:ubi8-14.2-1
    instances:
    - affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                postgres-operator.crunchydata.com/cluster: hippo
                postgres-operator.crunchydata.com/instance-set: instance1
            topologyKey: kubernetes.io/hostname
      dataVolumeClaimSpec:
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
      minAvailable: 2
      name: instance1
      replicas: 3
    port: 5432
    postgresVersion: 14
    users:
    - name: postgres
  status:
    conditions:
    - lastTransitionTime: "2022-04-04T14:42:31Z"
      message: pgBackRest replica create repo is ready for backups
      observedGeneration: 44
      reason: StanzaCreated
      status: "True"
      type: PGBackRestReplicaRepoReady
    - lastTransitionTime: "2022-04-04T14:44:29Z"
      message: pgBackRest replica creation is now possible
      observedGeneration: 44
      reason: RepoBackupComplete
      status: "True"
      type: PGBackRestReplicaCreate
    - lastTransitionTime: "2022-04-26T12:21:39Z"
      message: pgBackRest dedicated repository host is ready
      observedGeneration: 44
      reason: RepoHostReady
      status: "True"
      type: PGBackRestRepoHostReady
    - lastTransitionTime: "2022-04-05T12:39:25Z"
      message: Manual backup completed successfully
      observedGeneration: 40
      reason: ManualBackupComplete
      status: "True"
      type: PGBackRestManualBackupSuccessful
    databaseRevision: 559678bf8f
    instances:
    - name: instance1
      readyReplicas: 1
      replicas: 3
      updatedReplicas: 2
    monitoring:
      exporterConfiguration: 559c4c97d6
    observedGeneration: 44
    patroni:
      systemIdentifier: "7077734709681266763"
    pgbackrest:
      manualBackup:
        completionTime: "2022-04-05T12:39:24Z"
        finished: true
        id: "3"
        startTime: "2022-04-05T12:36:00Z"
        succeeded: 1
      repoHost:
        apiVersion: apps/v1
        kind: StatefulSet
        ready: true
      repos:
      - bound: true
        name: repo1
        replicaCreateBackupComplete: true
        stanzaCreated: true
        volume: pvc-4c44805c-b618-408c-9a78-ba77695206dd
      - name: repo2
        repoOptionsHash: 7549974f45
        stanzaCreated: true
      scheduledBackups:
      - completionTime: "2022-04-27T03:11:25Z"
        cronJobName: hippo-pgbackrest-repo1-full
        repo: repo1
        startTime: "2022-04-27T03:10:00Z"
        succeeded: 1
        type: full
      - completionTime: "2022-04-28T03:11:00Z"
        cronJobName: hippo-pgbackrest-repo1-full
        repo: repo1
        startTime: "2022-04-28T03:10:00Z"
        succeeded: 1
        type: full
      - completionTime: "2022-04-27T03:05:12Z"
        cronJobName: hippo-pgbackrest-repo2-full
        failed: 2
        repo: repo2
        startTime: "2022-04-27T03:00:00Z"
        succeeded: 1
        type: full
      - completionTime: "2022-04-28T03:05:11Z"
        cronJobName: hippo-pgbackrest-repo2-full
        failed: 3
        repo: repo2
        startTime: "2022-04-28T03:00:00Z"
        succeeded: 1
        type: full
      - completionTime: "2022-04-28T13:00:16Z"
        cronJobName: hippo-pgbackrest-repo2-incr
        repo: repo2
        startTime: "2022-04-28T13:00:00Z"
        succeeded: 1
        type: incr
      - completionTime: "2022-04-28T14:00:17Z"
        cronJobName: hippo-pgbackrest-repo2-incr
        repo: repo2
        startTime: "2022-04-28T14:00:00Z"
        succeeded: 1
        type: incr
      - completionTime: "2022-04-28T15:00:19Z"
        cronJobName: hippo-pgbackrest-repo2-incr
        repo: repo2
        startTime: "2022-04-28T15:00:00Z"
        succeeded: 1
        type: incr
    proxy:
      pgBouncer:
        postgresRevision: 5c9966f6bc
        usersRevision: 786cb8ff8c
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
```
Certificates:
```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: hippo-replication-tls
  namespace: pgo
spec:
  secretName: hippo-replication-tls
  duration: 2160h # 90d
  renewBefore: 360h # 15d
  commonName: hippo-primary
  dnsNames:
  - hippo-primary.pgo.svc
  - hippo-ha.pgo.svc
  - hippo-replicas.pgo.svc
  issuerRef:
    name: local-ca-issuer
    kind: ClusterIssuer
    group: cert-manager.io
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: hippo-tls
  namespace: pgo
spec:
  secretName: hippo-tls
  duration: 2160h # 90d
  renewBefore: 360h # 15d
  commonName: hippo-primary
  dnsNames:
  - hippo-primary.pgo.svc
  - hippo-ha.pgo.svc
  issuerRef:
    name: local-ca-issuer
    kind: ClusterIssuer
    group: cert-manager.io
```
I don't think this is documented anywhere, but the commonName of the replication certificate has to be "_crunchyrepl".
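In other words, for anyone hitting the same FATAL error: keep the service DNS names in `dnsNames`, but set the common name to the replication user. A corrected version of the `hippo-replication-tls` Certificate above would look like this (only `commonName` changes):

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: hippo-replication-tls
  namespace: pgo
spec:
  secretName: hippo-replication-tls
  duration: 2160h # 90d
  renewBefore: 360h # 15d
  commonName: _crunchyrepl  # must match the replication user PGO authenticates as
  dnsNames:
  - hippo-primary.pgo.svc
  - hippo-ha.pgo.svc
  - hippo-replicas.pgo.svc
  issuerRef:
    name: local-ca-issuer
    kind: ClusterIssuer
    group: cert-manager.io
```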
@bbroniewski Thank you for following up, our team will plan to look into improving the documentation related to manual certificate creation. Just to be clear, with the updated common name, is everything now working as expected?
@tjmoore4 yes, although the upgrade never finished, so I eventually killed it. The certificates work fine after the mentioned correction.
@bbroniewski glad to hear all is now working with your certificates.
As @tjmoore4 mentioned, we plan to update our documentation to better define any requirements for custom generated certs.
Thank you for your feedback and thanks for using PGO!
Thanks for your help noting the gap in the docs, @bbroniewski -- we've merged in a change to fix that, so I'm going to close this ticket now. If you run into any other bumps, please let us know!
Where in the docs? I'm having this same issue.