Ability to delete and re-order pgBackRest repos
Overview
Provide the ability to modify pgBackRest repos after creation.
Use Case
I currently have two repos: one PV-based (repo1) and one Azure-based (repo2). I want to get rid of the PV-based one. Currently, if I try to change them, Postgres refuses to start.
Desired Behavior
All pgBackRest repos are removed at boot, and a new repo is created with the new repo settings.
Environment
Tell us about your environment:
Please provide the following details:
- Platform: AKS
- Platform Version: 1.21.7
- PGO Image Tag: ubi8-5.1.1-0
- Postgres Version: 13
- Storage: Azure blob storage
- Number of Postgres clusters: 1
@trevor-primer it should be possible to modify your repos after cluster creation.
Can you provide a copy of your operator logs after attempting to modify the repos? Any logs from the various PG instance Pods, etc. that are unable to start would be great as well.
Additionally, if you could provide insight into the exact change you made (including an example PostgresCluster spec), that would be great as well. To clarify the change you did make, did you simply remove repo1 from your spec, leaving only repo2?
@trevor-primer just following up to see if you're able to provide the additional details, logs, etc. requested in my last message.
Thanks!
I will try and reproduce again this week and paste operator logs. Sorry for the delay. Thanks!
Postgres pod errors
2022-07-25 15:02:28,876 INFO: No PostgreSQL configuration items changed, nothing to reload.
2022-07-25 15:02:28,888 WARNING: Postgresql is not running.
2022-07-25 15:02:28,888 INFO: Lock owner: None; I am config-service-repotest-pgha1-jbrz-0
2022-07-25 15:02:28,892 INFO: pg_controldata:
pg_control version number: 1300
Catalog version number: 202007201
Database system identifier: 7124319653976735831
Database cluster state: shut down
pg_control last modified: Mon Jul 25 15:02:16 2022
Latest checkpoint location: 0/8000028
Latest checkpoint's REDO location: 0/8000028
Latest checkpoint's REDO WAL file: 000000020000000000000008
Latest checkpoint's TimeLineID: 2
Latest checkpoint's PrevTimeLineID: 2
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0:624
Latest checkpoint's NextOID: 19855
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 478
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 0
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTsXid: 0
Latest checkpoint's newestCommitTsXid: 0
Time of latest checkpoint: Mon Jul 25 15:02:16 2022
Fake LSN counter for unlogged rels: 0/3E8
Minimum recovery ending location: 0/0
Min recovery ending loc's timeline: 0
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: logical
wal_log_hints setting: on
max_connections setting: 100
max_worker_processes setting: 8
max_wal_senders setting: 10
max_prepared_xacts setting: 0
max_locks_per_xact setting: 64
track_commit_timestamp setting: off
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float8 argument passing: by value
Data page checksum version: 1
Mock authentication nonce: a4f81d6519d75fb6c91d547082d2c38386ff5e4e107bd3ca3d7ba4b647926d3e
2022-07-25 15:02:28,901 INFO: Lock owner: None; I am config-service-repotest-pgha1-jbrz-0
2022-07-25 15:02:29,352 INFO: starting as a secondary
2022-07-25 15:02:29,516 INFO: postmaster pid=94
2022-07-25 15:02:29.520 UTC [94] LOG: pgaudit extension initialized
/tmp/postgres:5432 - no response
2022-07-25 15:02:29.533 UTC [94] LOG: redirecting log output to logging collector process
2022-07-25 15:02:29.533 UTC [94] HINT: Future log output will appear in directory "log".
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
2022-07-25 15:02:39,384 INFO: Lock owner: None; I am config-service-repotest-pgha1-jbrz-0
2022-07-25 15:02:39,384 INFO: not healthy enough for leader race
2022-07-25 15:02:39,591 INFO: restarting after failure in progress
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
2022-07-25 15:02:49,383 INFO: Lock owner: None; I am config-service-repotest-pgha1-jbrz-0
2022-07-25 15:02:49,384 INFO: not healthy enough for leader race
2022-07-25 15:02:49,384 INFO: restarting after failure in progress
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
2022-07-25 15:02:59,383 INFO: Lock owner: None; I am config-service-repotest-pgha1-jbrz-0
2022-07-25 15:02:59,384 INFO: not healthy enough for leader race
2022-07-25 15:02:59,384 INFO: restarting after failure in progress
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
/tmp/postgres:5432 - rejecting connections
2022-07-25 15:03:09,383 INFO: Lock owner: None; I am config-service-repotest-pgha1-jbrz-0
2022-07-25 15:03:09,384 INFO: not healthy enough for leader race
2022-07-25 15:03:09,384 INFO: restarting after failure in progress
Operator logs
time="2022-07-25T15:13:17Z" level=debug msg="replaced configuration" file="internal/patroni/api.go:149" func=patroni.Executor.ReplaceConfiguration name=config-service-repotest namespace=config-service-repotest reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster stderr= stdout="Not changed\n" version=5.1.1-0
time="2022-07-25T15:13:18Z" level=debug msg="reconciled instance" file="internal/controller/postgrescluster/instance.go:1129" func="postgrescluster.(*Reconciler).reconcileInstance" instance=config-service-repotest-pgha1-jbrz name=config-service-repotest namespace=config-service-repotest reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.1.1-0
time="2022-07-25T15:13:18Z" level=debug msg="reconciled instance set" file="internal/controller/postgrescluster/instance.go:1025" func="postgrescluster.(*Reconciler).scaleUpInstances" instance-set=pgha1 name=config-service-repotest namespace=config-service-repotest reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.1.1-0
time="2022-07-25T15:13:18Z" level=debug msg="skipping SSH reconciliation, no repo hosts configured" file="internal/controller/postgrescluster/pgbackrest.go:1862" func="postgrescluster.(*Reconciler).reconcilePGBackRestConfig" name=config-service-repotest namespace=config-service-repotest reconcileResource=repoConfig reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.1.1-0
time="2022-07-25T15:13:18Z" level=debug msg="Could not find a pod with a writable database container." file="internal/controller/postgrescluster/postgres.go:729" func="postgrescluster.(*Reconciler).reconcileDatabaseInitSQL" name=config-service-repotest namespace=config-service-repotest reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.1.1-0
time="2022-07-25T15:13:18Z" level=debug msg="reconciled cluster" file="internal/controller/postgrescluster/controller.go:313" func="postgrescluster.(*Reconciler).Reconcile" name=config-service-repotest namespace=config-service-repotest reconciler group=postgres-operator.crunchydata.com reconciler kind=PostgresCluster version=5.1.1-0
I tried two scenarios:
- Deleted repo1 and left Azure as repo2
- Deleted repo1 and renamed Azure to repo1
The result was the same for both.
Here is the config I started with:
backups:
  pgbackrest:
    configuration:
    - secret:
        name: pgo-azure-creds
    global:
      repo1-path: /
      repo1-retention-full: "14"
      repo1-retention-full-type: time
    manual:
      options:
      - --type=full
      repoName: repo2
    repos:
    - name: repo1
      volume:
        volumeClaimSpec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              storage: 1500Gi
    - name: repo2
      azure:
        container: config-service-backups-repotest
      schedules:
        full: "0 1 * * *"
        incremental: "0 */8 * * *"
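When comparing a spec before and after removing a repo, one thing that can be checked mechanically is whether any `repoN-` prefixed global option or the manual backup's `repoName` still points at a repo that is no longer listed. The sketch below does that against a hand-copied dict of the config above. The function names and the dict are hypothetical, the `repoN-` prefix pattern is an assumption about how the options are named, and nothing in this thread establishes that such leftover references are the cause of the startup failure.

```python
import re

# Hypothetical in-memory copy of the pgbackrest section posted above
# (trimmed to the fields the check looks at).
pgbackrest = {
    "global": {
        "repo1-path": "/",
        "repo1-retention-full": "14",
        "repo1-retention-full-type": "time",
    },
    "manual": {"repoName": "repo2", "options": ["--type=full"]},
    "repos": [
        {"name": "repo1", "volume": {}},
        {"name": "repo2", "azure": {"container": "config-service-backups-repotest"}},
    ],
}


def dangling_repo_references(section):
    """List settings that name a repo missing from section['repos']."""
    defined = {repo["name"] for repo in section.get("repos", [])}
    problems = []
    # Global options are assumed to carry a 'repoN-' prefix.
    for key in section.get("global", {}):
        match = re.match(r"(repo\d+)-", key)
        if match and match.group(1) not in defined:
            problems.append(f"global.{key} refers to undefined {match.group(1)}")
    # The manual backup names its target repo explicitly.
    manual_repo = section.get("manual", {}).get("repoName")
    if manual_repo and manual_repo not in defined:
        problems.append(f"manual.repoName refers to undefined {manual_repo}")
    return problems


def without_repo(section, name):
    """Return a shallow copy of the section with one repo entry removed."""
    copy = dict(section)
    copy["repos"] = [r for r in section["repos"] if r["name"] != name]
    return copy


if __name__ == "__main__":
    # Simulates the first scenario, assuming only the repo1 entry was deleted.
    for line in dangling_repo_references(without_repo(pgbackrest, "repo1")):
        print(line)
```

Run against the starting config, the check reports nothing. Run against a copy with only the repo1 entry removed, it flags the three `repo1-` global options, which may be worth including when posting the before and after manifests.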
@trevor-primer, I am unable to reproduce the behavior you've described... Can you send the manifest you are using before and after deleting the repo so we know exactly what you are changing? Can you try using the latest PGO, pgBackRest, etc, and see if you still see the issue?
Closing as unable to reproduce.