postgres-operator-examples
configmap hippo-ssh-config is missing
I am not sure if this is a bug, but I just upgraded from 5.0.2 to 5.0.3 using the helm upgrade command. Since then, the first pod is being rescheduled but fails because of some missing config maps. I haven't changed anything in my config so far.
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 40s default-scheduler Successfully assigned postgresql/hippo-instance1-xqp6-0 to k8s-production-fr-standard-node-b4452d
Warning FailedMount 8s (x7 over 40s) kubelet MountVolume.SetUp failed for volume "ssh" : [configmap "hippo-ssh-config" not found, secret "hippo-ssh" not found]
Warning FailedMount 8s (x7 over 40s) kubelet MountVolume.SetUp failed for volume "pgbackrest-config" : configmap references non-existent config key: pgbackrest_instance.conf
Not sure if this is an undetected issue in the upgrade path.
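For anyone hitting the same events, here is a minimal sketch for confirming which of the objects named in the FailedMount messages are actually absent. The namespace `postgresql` and cluster name `hippo` are taken from this thread; the `missing_objects` helper is hypothetical and only wraps `kubectl get`.

```shell
#!/bin/sh
# Sketch: report which objects named in the FailedMount events are absent.
# NAMESPACE and CLUSTER default to the values used in this thread.
NAMESPACE="${NAMESPACE:-postgresql}"
CLUSTER="${CLUSTER:-hippo}"

missing_objects() {
  for obj in "configmap/${CLUSTER}-ssh-config" \
             "secret/${CLUSTER}-ssh"; do
    # "kubectl get" exits non-zero when the object does not exist
    # (and also when the cluster cannot be reached).
    if ! kubectl -n "$NAMESPACE" get "$obj" >/dev/null 2>&1; then
      echo "missing: $obj"
    fi
  done
}

missing_objects
```

If both lines are printed, the operator has not created the SSH ConfigMap and Secret for the cluster, which matches the events above.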
@ZuSe can you provide your PostgresCluster spec?
I am specifically curious about your pgBackRest repo configuration.
Thanks!
Hi @andrewlecuyer, sure. See below pls. As I said, I didn't touch anything except for pgbouncer service type.
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: postgres-operator.crunchydata.com/v1beta1
kind: PostgresCluster
metadata:
  annotations:
    meta.helm.sh/release-name: hippo
    meta.helm.sh/release-namespace: postgresql
  creationTimestamp: "2021-09-07T14:26:35Z"
  finalizers:
  - postgres-operator.crunchydata.com/finalizer
  generation: 14
  labels:
    app.kubernetes.io/managed-by: Helm
  name: hippo
  namespace: postgresql
  resourceVersion: "24742400499"
  uid: d69be417-aecf-42e8-aafb-04dc85f89bb8
spec:
  backups:
    pgbackrest:
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:centos8-2.35-0
      repos:
      - name: repo1
        volume:
          volumeClaimSpec:
            accessModes:
            - ReadWriteOnce
            resources:
              requests:
                storage: 100Gi
            storageClassName: csi-cinder-classic
      - name: repo2
        volume:
          volumeClaimSpec:
            accessModes:
            - ReadWriteOnce
            resources:
              requests:
                storage: 100Gi
            storageClassName: csi-cinder-classic
  image: registry.developers.crunchydata.com/crunchydata/crunchy-postgres:centos8-13.4-1
  instances:
  - dataVolumeClaimSpec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 50Gi
      storageClassName: csi-cinder-high-speed
    name: instance1
    replicas: 2
    resources:
      limits:
        cpu: 2
        memory: 4Gi
  patroni:
    dynamicConfiguration:
      postgresql:
        parameters:
          max_parallel_workers: 2
          max_worker_processes: 2
          shared_buffers: 1GB
          work_mem: 32MB
    leaderLeaseDurationSeconds: 30
    port: 8008
    syncPeriodSeconds: 10
  port: 5432
  postgresVersion: 13
  proxy:
    pgBouncer:
      config:
        global:
          ignore_startup_parameters: extra_float_digits,ssl_renegotiation_limit
          pool_mode: transaction
      image: registry.developers.crunchydata.com/crunchydata/crunchy-pgbouncer:centos8-1.15-3
      port: 5432
      replicas: 1
      resources:
        limits:
          cpu: 200m
          memory: 256Mi
      service:
        type: LoadBalancer
  users:
  - name: postgres
  - databases:
    - account_service_production
    - content_service_production
    - fhir_r5_production
    - fileserver_production
    - matomo_production
    - poll_service_production
    - reff_service_production
    - utility_service_production
    name: iatros
    options: CREATEDB CREATEROLE
status:
  conditions:
  - lastTransitionTime: "2021-10-05T20:28:30Z"
    message: pgBackRest dedicated repository host is ready
    observedGeneration: 14
    reason: RepoHostReady
    status: "True"
    type: PGBackRestRepoHostReady
  - lastTransitionTime: "2021-09-07T14:28:03Z"
    message: pgBackRest replica create repo is ready for backups
    observedGeneration: 12
    reason: StanzaCreated
    status: "True"
    type: PGBackRestReplicaRepoReady
  - lastTransitionTime: "2021-09-07T14:28:52Z"
    message: pgBackRest replica creation is now possible
    observedGeneration: 12
    reason: RepoBackupComplete
    status: "True"
    type: PGBackRestReplicaCreate
  - lastTransitionTime: "2021-10-05T20:28:04Z"
    message: Deployment has minimum availability.
    observedGeneration: 14
    reason: MinimumReplicasAvailable
    status: "True"
    type: ProxyAvailable
  databaseRevision: 685ff8ffb8
  instances:
  - name: instance1
    readyReplicas: 1
    replicas: 2
    updatedReplicas: 1
  monitoring:
    exporterConfiguration: 559c4c97d6
  observedGeneration: 14
  patroni:
    systemIdentifier: "7005198412650696782"
  pgbackrest:
    repoHost:
      apiVersion: apps/v1
      kind: StatefulSet
      ready: true
    repos:
    - bound: true
      name: repo1
      replicaCreateBackupComplete: true
      stanzaCreated: true
    - bound: true
      name: repo2
      stanzaCreated: true
  proxy:
    pgBouncer:
      postgresRevision: 694b7b5f67
      readyReplicas: 1
      replicas: 1
      usersRevision: 67886fd468
Are you using Helm to install PGO as well as to create the PostgresCluster?
In this case, which was specifically upgraded to v5.0.3? In other words, did you run helm upgrade for both the PGO install itself, as well as for the PostgresCluster?
@andrewlecuyer
I did it for both. First PGO, then Cluster
I think it is enough to just delete the replica and pgbackrest-host StatefulSets; the operator will then recreate them correctly. We had the same issue going from 5.0.2 to 5.0.4, and just deleting the pgbackrest-host StatefulSets triggered the operator to update the postgres instance StatefulSets.
Hello, I'm looking into this and I think I may have an answer to at least part of this.
I installed pgo 5.0.2 through Helm, and I created a cluster -- but I noticed that the hippo-ssh-config configMap was not present. I looked through some old docs and examples and added a field:
spec:
  backups:
    pgbackrest:
      repoHost:
        dedicated: {}
And that kicked off the creation and mounting of that cm & secret. If I delete that field and upgrade pgo to 5.0.3, it's fine, so the problem is with that repoHost area. Or at least, that's what I first thought.
But then I checked the operator logs and it was complaining:
time="2022-10-17T21:02:12Z" level=error msg="reconciling repository host"
error="StatefulSet.apps \"cluster-repo-host\" is invalid: spec: Forbidden:
updates to statefulset spec for fields other than 'replicas', 'template',
'updateStrategy', 'persistentVolumeClaimRetentionPolicy' and 'minReadySeconds'
are forbidden" ...
OK, so: there's no repo-host statefulset created in 5.0.2 unless the backups.pgbackrest.repoHost object is filled in. And once that statefulset exists, upgrades from 5.0.2 are going to run into an error, because in versions after 5.0.2 the statefulset has a topology spread constraint, which can't be updated.
As @cr1cr1 pointed out, a solution here is to delete the sts that can't be updated, which will unblock the operator, which will then create the missing cm and secret (and also update the <clustername>-pgbackrest-config configmap into the right form).
How do I feel about that solution? Well, that's actually our recommended solution in the docs: https://access.crunchydata.com/documentation/postgres-operator/v5/upgrade/kustomize/#upgrading-from-pgo-v5-0-2-and-below
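As a rough sketch of that workaround (not the exact commands from the linked docs, which should take precedence): the script below assumes the namespace `postgresql` and cluster `hippo` from this thread, and assumes the repo host StatefulSet is named `<clustername>-repo-host`, as the operator log above suggests. The `run` and `unblock_repo_host` helpers are hypothetical. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch only: delete the StatefulSet the operator cannot update in place,
# then check that the missing ConfigMap and Secret have been recreated.
NAMESPACE="${NAMESPACE:-postgresql}"
CLUSTER="${CLUSTER:-hippo}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands, 0 = execute them

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

unblock_repo_host() {
  # List the cluster's StatefulSets first to confirm the actual names.
  run kubectl -n "$NAMESPACE" get statefulset \
    -l "postgres-operator.crunchydata.com/cluster=${CLUSTER}"
  # Assumed name: <clustername>-repo-host (inferred from the operator log).
  run kubectl -n "$NAMESPACE" delete statefulset "${CLUSTER}-repo-host"
  # Once the operator has reconciled, these objects should exist again.
  run kubectl -n "$NAMESPACE" get configmap "${CLUSTER}-ssh-config"
  run kubectl -n "$NAMESPACE" get secret "${CLUSTER}-ssh"
}

unblock_repo_host
```

The operator may need a moment to reconcile after the delete, so the two final checks can fail on the first try and succeed shortly afterwards.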