Failed PGO Backup

Open idanl21 opened this issue 3 years ago • 2 comments

Hey Crunchy! I'm using your PGO operator in my EKS environment. As part of our work, we create backups to an S3 bucket using PGO's backup feature. The backup ConfigMap definition:

apiVersion: v1
data:
  pgo-001-pgbackrest-incr: '{"version":"v1","name":"pgo-001-pgbackrest-incr","cluster":"pgo-001","created":"2022-02-02T12:55:43Z","schedule":"0 */4 * * *","namespace":"pgo","type":"pgbackrest","pgbackrest":{"label":"pg-cluster=pgo-001,name=pgo-001,deployment-name=pgo-001","container":"database","type":"incr"},"policy":{}}'
kind: ConfigMap
metadata:
  manager: kubectl-create
  operation: Update
  name: pgo-001-pgbackrest-incr-2
  namespace: pgo
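For readers unfamiliar with this schedule format, here is a minimal Python sketch (not part of the original report) that decodes the JSON payload stored in the ConfigMap and expands the hour field of its cron expression. The helper name `expand_hours` is hypothetical, and it only handles the `*`, `*/N`, and comma-list forms, not full cron syntax.

```python
import json

# Payload copied from the ConfigMap's data value above.
payload = '{"version":"v1","name":"pgo-001-pgbackrest-incr","cluster":"pgo-001","created":"2022-02-02T12:55:43Z","schedule":"0 */4 * * *","namespace":"pgo","type":"pgbackrest","pgbackrest":{"label":"pg-cluster=pgo-001,name=pgo-001,deployment-name=pgo-001","container":"database","type":"incr"},"policy":{}}'


def expand_hours(field):
    """Expand a cron hour field (hypothetical helper; '*', '*/N', 'a,b,c' only)."""
    if field == "*":
        return list(range(24))
    if field.startswith("*/"):
        return list(range(0, 24, int(field[2:])))
    return sorted(int(part) for part in field.split(","))


task = json.loads(payload)
# Cron fields are: minute hour day-of-month month day-of-week.
minute, hour, *_ = task["schedule"].split()
hours = expand_hours(hour)
print(task["pgbackrest"]["type"], "backup at minute", minute, "of hours", hours)
```

Read this way, the task requests an incremental backup every four hours, which is consistent with the four-hour spacing of the most recent backup labels in the log output further down.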

It worked for a couple of months, but in the last few days it started to fail with this error:

time="2022-06-07T08:32:33Z" level=info msg="output=[]" time="2022-06-07T08:32:33Z" level=info msg="stderr=[WARN: option 'repo1-retention-full' is not set for 'repo1-retention-full-type=count', the repository may run out of space\n HINT: to retain full backups indefinitely (without warning), set option 'repo1-retention-full' to the maximum.\nWARN: backup

When I saw that error, I deleted some old backups from the bucket. Now I'm getting this error:

time="2022-06-07T08:32:33Z" level=info msg="output=[]" time="2022-06-07T08:32:33Z" level=info msg="stderr=[WARN: option 'repo1-retention-full' is not set for 'repo1-retention-full-type=count', the repository may run out of space\n HINT: to retain full backups indefinitely (without warning), set option 'repo1-retention-full' to the maximum.\nWARN: backup '20211017-093957F_20220206-000005I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220205-200005I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220205-160005I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220205-120015I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220205-080004I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220205-040004I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220205-000005I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220204-200005I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220204-160004I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220204-120005I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220204-080004I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220204-040005I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220204-000005I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220203-200004I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220203-160015I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220203-120004I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220203-080004I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220203-040004I' missing manifest removed from 
backup.info\nWARN: backup '20211017-093957F_20220203-000004I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220202-200005I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220202-160016I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220202-100016I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220202-080004I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220202-060004I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220202-040004I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220202-020004I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220202-000004I' missing manifest removed from backup.info\nWARN: backup '20211017-093957F_20220201-220004I' missing manifest removed from backup.info\nWARN:

The warning is repeated for every backup that I deleted.
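To make the warnings easier to read, here is a small Python sketch (an illustration, not something from the thread) that pulls the backup labels out of the stderr text and decodes them. It relies on the pgBackRest label convention, where a full backup is named `YYYYMMDD-HHMMSSF` and a differential or incremental backup appends `_YYYYMMDD-HHMMSSD` or `_YYYYMMDD-HHMMSSI` to the label of the full backup it depends on. Only three of the lines from the log are included as sample input.

```python
import re
from datetime import datetime

# Excerpt of the stderr text above; the "\n" sequences are literal in the log.
stderr = (
    r"WARN: backup '20211017-093957F_20220206-000005I' missing manifest removed from backup.info\n"
    r"WARN: backup '20211017-093957F_20220205-200005I' missing manifest removed from backup.info\n"
    r"WARN: backup '20211017-093957F_20220201-220004I' missing manifest removed from backup.info\n"
)

# Group 1: full backup timestamp; groups 2 and 3: optional child timestamp and type.
LABEL = re.compile(r"backup '(\d{8}-\d{6})F(?:_(\d{8}-\d{6})([DI]))?' missing manifest")


def parse(ts):
    """Convert a label timestamp such as 20211017-093957 to a datetime."""
    return datetime.strptime(ts, "%Y%m%d-%H%M%S")


removed = []
for full_ts, child_ts, kind in LABEL.findall(stderr):
    removed.append({
        "full": parse(full_ts),
        "taken": parse(child_ts) if child_ts else parse(full_ts),
        "type": {"": "full", "D": "diff", "I": "incr"}[kind],
    })

fulls = {entry["full"] for entry in removed}
print(len(removed), "backups dropped from backup.info")
print("parent full backups:", sorted(f.isoformat() for f in fulls))
```

Every label in the log shares the prefix 20211017-093957F, so all of the incrementals listed there depend on a single full backup taken on 2021-10-17.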

Also, when I run the command pgo show backup pgo-001 -n pgo, I get:

Error: Do: Get "https://localhost:8443/backrest/pgo-001?version=4.7.0&selector=&namespace=pgo": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
Error: Get "https://localhost:8443/backrest/pgo-001?version=4.7.0&selector=&namespace=pgo": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

(This happens only with this command; pgo show cluster pgo-001 -n pgo works.)

Has anyone faced this issue before? I'm using EKS with crunchy-postgres-ha:centos8-13.3-4.7.0, backing up to AWS S3.

Thanks!

idanl21, Jun 07 '22 09:06

My pgo-backrest-shared-repo pod logs:

k logs pgo-001-backrest-shared-repo-5c496f4b46-2zjg4  -n pgo -f
NWRAP_ERROR(1) - nwrap_files_cache_reload: Unable to open '/tmp/nss_wrapper/pgbackrest-repo/passwd' readonly -1:No such file or directory
NWRAP_ERROR(1) - nwrap_files_getpwuid: Error loading passwd file
nss_wrapper: user exists
nss_wrapper: group exists
nss_wrapper: environment configured
Starting the pgBackRest repo
nss_wrapper: ssh configured

The pgBackRest repo has been started
WARNING: 'UsePAM no' is not supported in Fedora and may cause several problems.

idanl21, Jun 07 '22 12:06

> When I saw that error, I deleted some old backups from the bucket. Now I'm getting this error

How exactly did you delete your backups?

When using pgBackRest as we do in PGO, backups should be expired (e.g. using the expire command) and should not be manually deleted from the S3 bucket.

When using PGO v4, you can manually expire backups using the pgo delete backup command. The underlying pgBackRest expire command is documented here:

https://pgbackrest.org/command.html#command-expire
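As a rough illustration of that advice, the Python sketch below assembles a `pgo delete backup` invocation for a single backup label and prints it without running anything. The `--target` flag is written from memory of the PGO v4 documentation and should be treated as an assumption; check `pgo delete backup --help` for your client version. The function name `build_delete_command` is hypothetical, and the label is taken from the log above purely as a format example.

```python
import re
import shlex

# pgBackRest label: full backup, optionally followed by a diff/incr suffix.
LABEL_RE = re.compile(r"^\d{8}-\d{6}F(_\d{8}-\d{6}[DI])?$")


def build_delete_command(cluster, namespace, label):
    """Return the argv for expiring one backup through the pgo client.

    Assumption: `--target` selects the backup label; confirm locally
    with `pgo delete backup --help` before running anything.
    """
    if not LABEL_RE.match(label):
        raise ValueError(f"not a pgBackRest backup label: {label!r}")
    return ["pgo", "delete", "backup", cluster, f"--target={label}", "-n", namespace]


argv = build_delete_command("pgo-001", "pgo", "20211017-093957F_20220201-220004I")
print(shlex.join(argv))  # print only; nothing is executed here
```

Validating the label first is a deliberate choice in this sketch: it keeps a mistyped argument from reaching the operator, and going through the client rather than the bucket lets pgBackRest keep backup.info in step with what is actually stored.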

andrewlecuyer, Jul 13 '22 04:07

Hi @idanl21!

We are currently reviewing some past issues, and see that there have not been any updates on this one since our last reply. I am therefore proceeding with closing this issue.

If you have any additional questions or issues regarding PGO and backups, please do not hesitate to open a new issue.

Additionally, if you have not already done so, I also recommend checking out the documentation for the latest version of Crunchy Postgres for Kubernetes, including the guide for upgrading from PGO v4 to v5.

  • https://access.crunchydata.com/documentation/postgres-operator/latest/
  • https://access.crunchydata.com/documentation/postgres-operator/latest/upgrade/v4tov5/

andrewlecuyer, Mar 21 '23 21:03