Manual database backup error and duplicated backup

Open Eric-zch opened this issue 1 year ago • 11 comments

Overview

I am testing Crunchy Postgres database backups on OpenShift and found two results I do not understand: (1) Multiple backup Pods were started and some of them failed quickly. (2) The backup ran twice after I issued a single backup command.

Environment

  • Platform: OpenShift
  • Platform Version: 4.10.53
  • PGO Image Tag: ubi8-5.3.0-0
  • Postgres Version: ubi8-14.6-2
  • Pgbackrest: 2.41

Steps to Reproduce

Issue a manual database backup command:

[zhaoch@x86_64]$oc pgo backup pnst --repoName=repo1 --options="--type=full"
postgresclusters/pnst backup initiated
[zhaoch@x86_64]$
[zhaoch@x86_64]$oc get po
NAME                             READY   STATUS      RESTARTS   AGE
asb-activemq5-6bff45c5b8-j9fhf   1/1     Running     0          2d18h
pnst-backup-996l-m7p5z           0/1     Completed   0          3h27m
pnst-backup-pvl5-bdxkx           1/1     Running     0          4s
pnst-backup-pvl5-dc8gr           0/1     Error       0          76s
pnst-backup-pvl5-gcg9f           0/1     Error       0          55s
pnst-backup-pvl5-z78f8           0/1     Error       0          45s
pnst-backup-tmjl-4f4jr           0/1     Completed   0          76s
pnst-instance1-69lq-0            5/5     Running     0          4m31s
pnst-instance1-fv42-0            5/5     Running     0          6m16s
pnst-pgbouncer-647dcd786-mjsw9   2/2     Running     0          6m17s
pnst-repo-host-0                 2/2     Running     0          6m16s
[zhaoch@x86_64]$

Three backup Pods (pnst-backup-pvl5-dc8gr, pnst-backup-pvl5-gcg9f, pnst-backup-pvl5-z78f8) ended in Error status.
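
One way to read the listing is to group the Pods by the Job that owns them. The sketch below assumes the usual Kubernetes naming, where a Job's Pod is called <job-name>-<random suffix>; the helper `pods_by_job` is hypothetical and only parses text pasted from `oc get po`.

```python
from collections import defaultdict

# Backup Pod rows copied from the `oc get po` output in this report.
POD_TABLE = """\
pnst-backup-996l-m7p5z           0/1     Completed   0          3h27m
pnst-backup-pvl5-bdxkx           1/1     Running     0          4s
pnst-backup-pvl5-dc8gr           0/1     Error       0          76s
pnst-backup-pvl5-gcg9f           0/1     Error       0          55s
pnst-backup-pvl5-z78f8           0/1     Error       0          45s
pnst-backup-tmjl-4f4jr           0/1     Completed   0          76s
"""


def pods_by_job(table):
    """Group (pod, status) pairs by Job name.

    Assumption: the Job name is everything before the last hyphen
    of the Pod name (<job-name>-<random suffix>).
    """
    groups = defaultdict(list)
    for line in table.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed rows
        pod, status = fields[0], fields[2]
        job = pod.rsplit("-", 1)[0]
        groups[job].append((pod, status))
    return dict(groups)


groups = pods_by_job(POD_TABLE)
for job, pods in groups.items():
    print(job, [status for _, status in pods])
```

On this sample, the three Error Pods and the Running Pod all belong to pnst-backup-pvl5, while pnst-backup-tmjl has a single Completed Pod. That pattern would be consistent with one Job retrying its Pod while a second Job held the pgBackRest lock.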

[zhaoch@x86_64]$oc logs -f pnst-backup-pvl5-dc8gr 
time="2023-03-23T05:00:53Z" level=info msg="crunchy-pgbackrest starts"
time="2023-03-23T05:00:53Z" level=info msg="debug flag set to false"
time="2023-03-23T05:00:53Z" level=info msg="backrest backup command requested"
time="2023-03-23T05:00:53Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --repo=1 --type=full]"
time="2023-03-23T05:00:53Z" level=info msg="output=[]"
time="2023-03-23T05:00:53Z" level=info msg="stderr=[ERROR: [050]: unable to acquire lock on file '/tmp/pgbackrest/db-backup.lock': Resource temporarily unavailable\n       HINT: is another pgBackRest process running?\n]"
time="2023-03-23T05:00:53Z" level=fatal msg="command terminated with exit code 50"
[zhaoch@x86_64]$oc logs -f pnst-backup-pvl5-gcg9f
time="2023-03-23T05:00:57Z" level=info msg="crunchy-pgbackrest starts"
time="2023-03-23T05:00:57Z" level=info msg="debug flag set to false"
time="2023-03-23T05:00:57Z" level=info msg="backrest backup command requested"
time="2023-03-23T05:00:57Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --repo=1 --type=full]"
time="2023-03-23T05:00:57Z" level=info msg="output=[]"
time="2023-03-23T05:00:57Z" level=info msg="stderr=[ERROR: [050]: unable to acquire lock on file '/tmp/pgbackrest/db-backup.lock': Resource temporarily unavailable\n       HINT: is another pgBackRest process running?\n]"
time="2023-03-23T05:00:57Z" level=fatal msg="command terminated with exit code 50"
[zhaoch@x86_64]$oc logs -f pnst-backup-pvl5-z78f8 
time="2023-03-23T05:01:07Z" level=info msg="crunchy-pgbackrest starts"
time="2023-03-23T05:01:07Z" level=info msg="debug flag set to false"
time="2023-03-23T05:01:08Z" level=info msg="backrest backup command requested"
time="2023-03-23T05:01:08Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --repo=1 --type=full]"
time="2023-03-23T05:01:08Z" level=info msg="output=[]"
time="2023-03-23T05:01:08Z" level=info msg="stderr=[ERROR: [050]: unable to acquire lock on file '/tmp/pgbackrest/db-backup.lock': Resource temporarily unavailable\n       HINT: is another pgBackRest process running?\n]"
time="2023-03-23T05:01:08Z" level=fatal msg="command terminated with exit code 50"
[zhaoch@x86_64]$
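
The three failed Pods report the same thing: pgBackRest error 050 (lock not acquired) and a wrapper exit code of 50. A minimal sketch for pulling those fields out of a saved Pod log follows; the function name `summarize` and the regular expressions are illustrative assumptions based only on the log lines shown here, not part of any Crunchy tooling.

```python
import re

# Log lines copied from one of the failed Pods in this report.
LOG = r'''time="2023-03-23T05:00:53Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --repo=1 --type=full]"
time="2023-03-23T05:00:53Z" level=info msg="stderr=[ERROR: [050]: unable to acquire lock on file '/tmp/pgbackrest/db-backup.lock': Resource temporarily unavailable\n       HINT: is another pgBackRest process running?\n]"
time="2023-03-23T05:00:53Z" level=fatal msg="command terminated with exit code 50"
'''

# Error code in brackets, then the message up to the first escaped newline.
ERROR_RE = re.compile(r"ERROR: \[(\d+)\]: ([^\\\n]+)")
EXIT_RE = re.compile(r"command terminated with exit code (\d+)")


def summarize(log):
    """Extract the pgBackRest error code/message and the wrapper exit code."""
    err = ERROR_RE.search(log)
    exit_match = EXIT_RE.search(log)
    return {
        "error_code": int(err.group(1)) if err else None,
        "error_message": err.group(2).strip() if err else None,
        "exit_code": int(exit_match.group(1)) if exit_match else None,
        "lock_contention": bool(err and "unable to acquire lock" in err.group(2)),
    }


summary = summarize(LOG)
print(summary)
```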

The strange thing is that the full backup was executed twice, by two Pods (pnst-backup-tmjl-4f4jr and pnst-backup-pvl5-bdxkx).

[zhaoch@x86_64]$oc pgo show backup pnst
stanza: db
    status: ok
    cipher: none

    db (current)
        wal archive min/max (14): 000000010000000000000001/00000008000000000000002D

    ......

        full backup: 20230323-050054F
            timestamp start/stop: 2023-03-23 05:00:54 / 2023-03-23 05:01:29
            wal start/stop: 00000008000000000000002B / 00000008000000000000002B
            database size: 61.5MB, database backup size: 61.5MB
            repo1: backup set size: 6.6MB, backup size: 6.6MB

        full backup: 20230323-050150F
            timestamp start/stop: 2023-03-23 05:01:50 / 2023-03-23 05:02:00
            wal start/stop: 00000008000000000000002D / 00000008000000000000002D
            database size: 61.6MB, database backup size: 61.6MB
            repo1: backup set size: 6.6MB, backup size: 6.6MB
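
The full-backup labels encode their start time, so the gap between the two backups can be computed directly. The sketch below assumes full-backup labels of the form YYYYMMDD-HHMMSSF, as in the output shown; the helper names and the five-minute window are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta

# Full-backup labels copied from the `oc pgo show backup` output in this report.
LABELS = ["20230323-050054F", "20230323-050150F"]


def label_start(label):
    """Parse the start time encoded in a full-backup label (YYYYMMDD-HHMMSSF)."""
    return datetime.strptime(label[:15], "%Y%m%d-%H%M%S")


def close_pairs(labels, window=timedelta(minutes=5)):
    """Return consecutive labels whose start times fall within `window`."""
    ordered = sorted(labels, key=label_start)
    return [
        (a, b, label_start(b) - label_start(a))
        for a, b in zip(ordered, ordered[1:])
        if label_start(b) - label_start(a) <= window
    ]


pairs = close_pairs(LABELS)
for first, second, gap in pairs:
    print(first, second, gap.total_seconds())
```

For the two labels in this report the gap is under a minute, which matches a single command producing two back-to-back full backups.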

The same thing happens with incremental backups too.

I did not have this duplicated backup problem when I was on PGO 5.2.0.

Eric-zch, Mar 23 '23 05:03

Hey @Eric-zch, sorry you're having trouble. Those backup Pods do seem to be failing because pgBackRest is already in the middle of a backup, which makes sense given that two backups are being created at the same time. So the ultimate question is: why are two backups being attempted?

When this occurs, do you see multiple backup jobs get created (oc get jobs)? Can you get the operator logs for when the issue occurs?

I see that you are using the PGO client. What version of the CLI are you using? What happens if you attempt to do a one-off backup as outlined in the following doc?

https://access.crunchydata.com/documentation/postgres-operator/v5/tutorial/backup-management/
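
For reference, the one-off backup in that tutorial does not use the CLI: it sets spec.backups.pgbackrest.manual on the PostgresCluster and then triggers the backup with the postgres-operator.crunchydata.com/pgbackrest-backup annotation. The sketch below only builds the two `oc` command lines and runs nothing; the function name, the timestamp format of the annotation value, and the default arguments are assumptions, so check them against the documentation for your PGO version.

```python
import json
from datetime import datetime, timezone

# Annotation key used by PGO v5 to trigger a manual backup (per the tutorial).
BACKUP_ANNOTATION = "postgres-operator.crunchydata.com/pgbackrest-backup"


def manual_backup_commands(cluster, repo="repo1", options=("--type=full",), now=None):
    """Build the two `oc` invocations for a one-off backup (not executed here)."""
    patch = {
        "spec": {"backups": {"pgbackrest": {"manual": {
            "repoName": repo,
            "options": list(options),
        }}}}
    }
    # Any changed value re-triggers the backup; a timestamp is a convenient choice.
    stamp = (now or datetime.now(timezone.utc)).strftime("%Y-%m-%dT%H:%M:%SZ")
    return [
        ["oc", "patch", "postgrescluster", cluster, "--type=merge", "-p", json.dumps(patch)],
        ["oc", "annotate", "postgrescluster", cluster, "--overwrite",
         f"{BACKUP_ANNOTATION}={stamp}"],
    ]


for argv in manual_backup_commands("pnst"):
    print(" ".join(argv))
```

Running the backup this way takes the CLI out of the picture, which helps tell whether the duplicate Job comes from the client or from the operator.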

dsessler7, Mar 28 '23 01:03

Hi @dsessler7

Our daily scheduled backups do not have this issue.

Yes, I used the pgo client. I find it more user-friendly than the old way of taking backups.

pgo client version:

[zhaoch@~]$oc pgo version
Client Version: v0.2.0
Operator Version: v5.3.0
[zhaoch@~]$

Backup command:

oc pgo backup pnst --repoName=repo1 --options="--type=full"

Two Jobs (pnst-backup-2p8h and pnst-backup-vbrw) were kicked off for the backup.

[zhaoch@~]$oc get job
NAME                       COMPLETIONS   DURATION   AGE
pnst-backup-2p8h           1/1           38s        107s
pnst-backup-gx8j           1/1           115s       6d19h
pnst-backup-vbrw           1/1           88s        107s
pnst-repo1-full-27995040   1/1           57s        5d2h
pnst-repo1-incr-27999420   1/1           14s        2d1h
pnst-repo1-incr-28000860   1/1           15s        25h
pnst-repo1-incr-28002300   1/1           15s        62m
[zhaoch@~]$
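
Jobs created by the same command can be spotted by matching their AGE values. The sketch below is a rough heuristic under two assumptions: manual-backup Jobs are the ones named pnst-backup-<suffix> (inferred from the listing), and equal AGE means created together (AGE is rounded, so comparing creation timestamps would be more precise). The helper names are hypothetical.

```python
import re
from collections import defaultdict

# Rows copied from the `oc get job` output in this report.
JOB_TABLE = """\
pnst-backup-2p8h           1/1           38s        107s
pnst-backup-gx8j           1/1           115s       6d19h
pnst-backup-vbrw           1/1           88s        107s
pnst-repo1-full-27995040   1/1           57s        5d2h
"""

UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1}


def age_seconds(age):
    """Convert an age such as '6d19h' or '107s' to seconds."""
    return sum(int(n) * UNIT_SECONDS[u] for n, u in re.findall(r"(\d+)([dhms])", age))


def simultaneous_manual_jobs(table, prefix="pnst-backup-"):
    """Group manual-backup Jobs by age and keep groups with more than one Job."""
    by_age = defaultdict(list)
    for line in table.splitlines():
        fields = line.split()
        if len(fields) == 4 and fields[0].startswith(prefix):
            by_age[age_seconds(fields[3])].append(fields[0])
    return {age: names for age, names in by_age.items() if len(names) > 1}


dupes = simultaneous_manual_jobs(JOB_TABLE)
print(dupes)
```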

Five Pods were started for this backup.

Three failed: pnst-backup-vbrw-6lzgc, pnst-backup-vbrw-7rmrm, pnst-backup-vbrw-z74dr

Two succeeded: pnst-backup-2p8h-z89sd, pnst-backup-vbrw-ccqp6

[zhaoch@~]$oc get po
NAME                             READY   STATUS      RESTARTS   AGE
asb-activemq5-6bff45c5b8-j9fhf   1/1     Running     0          9d
pnst-backup-2p8h-z89sd           0/1     Completed   0          3m7s
pnst-backup-gx8j-zf42g           0/1     Completed   0          6d19h
pnst-backup-vbrw-6lzgc           0/1     Error       0          2m57s
pnst-backup-vbrw-7rmrm           0/1     Error       0          2m37s
pnst-backup-vbrw-ccqp6           0/1     Completed   0          2m15s
pnst-backup-vbrw-z74dr           0/1     Error       0          3m7s
pnst-instance1-hjkm-0            5/5     Running     0          6d18h
pnst-instance1-tw8w-0            5/5     Running     0          5d20h
pnst-pgbouncer-647dcd786-4t2p7   2/2     Running     0          6d19h
pnst-repo-host-0                 2/2     Running     0          6d19h
pnst-repo1-full-27995040-9ntdd   0/1     Completed   0          5d2h
pnst-repo1-incr-27999420-w2b2v   0/1     Completed   0          2d1h
pnst-repo1-incr-28000860-zzfdz   0/1     Completed   0          25h
pnst-repo1-incr-28002300-9kwsm   0/1     Completed   0          63m
[zhaoch@~]$

Backup Pod messages:

[zhaoch@~]$oc logs -f pnst-backup-vbrw-6lzgc 
time="2023-03-30T02:00:49Z" level=info msg="crunchy-pgbackrest starts"
time="2023-03-30T02:00:49Z" level=info msg="debug flag set to false"
time="2023-03-30T02:00:49Z" level=info msg="backrest backup command requested"
time="2023-03-30T02:00:49Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --repo=1 --type=full]"
time="2023-03-30T02:00:49Z" level=info msg="output=[]"
time="2023-03-30T02:00:49Z" level=info msg="stderr=[ERROR: [050]: unable to acquire lock on file '/tmp/pgbackrest/db-backup.lock': Resource temporarily unavailable\n       HINT: is another pgBackRest process running?\n]"
time="2023-03-30T02:00:49Z" level=fatal msg="command terminated with exit code 50"
[zhaoch@~]$oc logs -f pnst-backup-vbrw-7rmrm
time="2023-03-30T02:01:09Z" level=info msg="crunchy-pgbackrest starts"
time="2023-03-30T02:01:09Z" level=info msg="debug flag set to false"
time="2023-03-30T02:01:09Z" level=info msg="backrest backup command requested"
time="2023-03-30T02:01:09Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --repo=1 --type=full]"
time="2023-03-30T02:01:09Z" level=info msg="output=[]"
time="2023-03-30T02:01:09Z" level=info msg="stderr=[ERROR: [050]: unable to acquire lock on file '/tmp/pgbackrest/db-backup.lock': Resource temporarily unavailable\n       HINT: is another pgBackRest process running?\n]"
time="2023-03-30T02:01:09Z" level=fatal msg="command terminated with exit code 50"
[zhaoch@~]$oc logs -f pnst-backup-vbrw-z74dr
time="2023-03-30T02:00:38Z" level=info msg="crunchy-pgbackrest starts"
time="2023-03-30T02:00:38Z" level=info msg="debug flag set to false"
time="2023-03-30T02:00:39Z" level=info msg="backrest backup command requested"
time="2023-03-30T02:00:39Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --repo=1 --type=full]"
time="2023-03-30T02:00:40Z" level=info msg="output=[]"
time="2023-03-30T02:00:40Z" level=info msg="stderr=[ERROR: [050]: unable to acquire lock on file '/tmp/pgbackrest/db-backup.lock': Resource temporarily unavailable\n       HINT: is another pgBackRest process running?\n]"
time="2023-03-30T02:00:40Z" level=fatal msg="command terminated with exit code 50"
[zhaoch@~]$
[zhaoch@~]$oc logs -f pnst-backup-2p8h-z89sd
time="2023-03-30T02:00:39Z" level=info msg="crunchy-pgbackrest starts"
time="2023-03-30T02:00:39Z" level=info msg="debug flag set to false"
time="2023-03-30T02:00:39Z" level=info msg="backrest backup command requested"
time="2023-03-30T02:00:39Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --repo=1 --type=full]"
time="2023-03-30T02:01:12Z" level=info msg="output=[]"
time="2023-03-30T02:01:12Z" level=info msg="stderr=[]"
time="2023-03-30T02:01:12Z" level=info msg="crunchy-pgbackrest ends"
[zhaoch@~]$oc logs -f pnst-backup-vbrw-ccqp6
time="2023-03-30T02:01:31Z" level=info msg="crunchy-pgbackrest starts"
time="2023-03-30T02:01:31Z" level=info msg="debug flag set to false"
time="2023-03-30T02:01:31Z" level=info msg="backrest backup command requested"
time="2023-03-30T02:01:31Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --repo=1 --type=full]"
time="2023-03-30T02:02:03Z" level=info msg="output=[]"
time="2023-03-30T02:02:03Z" level=info msg="stderr=[]"
time="2023-03-30T02:02:03Z" level=info msg="crunchy-pgbackrest ends"
[zhaoch@~]$

Backup history:

[zhaoch@~]$oc pgo show backup pnst
stanza: db
    status: ok
    cipher: none
    ......
        full backup: 20230330-020040F
            timestamp start/stop: 2023-03-30 02:00:40 / 2023-03-30 02:01:12
            wal start/stop: 0000000700000005000000A5 / 0000000700000005000000A6
            database size: 1.1GB, database backup size: 1.1GB
            repo1: backup set size: 92.9MB, backup size: 92.9MB

        full backup: 20230330-020132F
            timestamp start/stop: 2023-03-30 02:01:32 / 2023-03-30 02:02:03
            wal start/stop: 0000000700000005000000A8 / 0000000700000005000000A8
            database size: 1.1GB, database backup size: 1.1GB
            repo1: backup set size: 93.0MB, backup size: 93.0MB

Log messages from the Operator Pod during the backup:

time="2023-03-30T02:00:22Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=smps namespace=int-smps postgresCluster=int-smps/smps reconcileID=fe3bb221-9e37-428b-877d-bcd4a9b249f3 stderr="2023-03-30 02:00:22,273 - ERROR - Unexpected error from Kubernetes API\nTraceback (most recent call last):\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 498, in wrapper\n    return func(*args, **kwargs)\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 937, in patch_or_create\n    raise e\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 931, in patch_or_create\n    return self._patch_or_create(name, annotations, resource_version, patch, retry, ips)\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 921, in _patch_or_create\n    ret = retry(func, self._namespace, body) if retry else func(self._namespace, body)\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 483, in wrapper\n    return getattr(self._core_v1_api, func)(*args, **kwargs)\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 419, in wrapper\n    return self._api_client.call_api(method, path, headers, body, **kwargs)\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 388, in call_api\n    return self._handle_server_response(response, _preload_content)\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 218, in _handle_server_response\n    raise k8s_client.rest.ApiException(http_resp=response)\npatroni.dcs.kubernetes.K8sClient.rest.ApiException: (422)\nReason: Unprocessable Entity\nHTTP response headers: HTTPHeaderDict({'Audit-Id': 'b5be03eb-5be3-4b50-b6af-f3fb3276edfe', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 
'X-Kubernetes-Pf-Flowschema-Uid': 'd8c76dee-5d73-4e14-9caa-711f3aec3afd', 'X-Kubernetes-Pf-Prioritylevel-Uid': '3d34cb27-5ead-4176-a761-38e53c5aabbb', 'Date': 'Thu, 30 Mar 2023 02:00:22 GMT', 'Content-Length': '741'})\nHTTP response body: b'{\"kind\":\"Status\",\"apiVersion\":\"v1\",\"metadata\":{},\"status\":\"Failure\",\"message\":\"Endpoints \\\\\"smps-ha-config\\\\\" is invalid: metadata.annotations: Too long: must have at most 262144 bytes\",\"reason\":\"Invalid\",\"details\":{\"name\":\"smps-ha-config\",\"kind\":\"Endpoints\",\"causes\":[{\"reason\":\"FieldValueTooLong\",\"message\":\"Too long: must have at most 262144 bytes\",\"field\":\"metadata.annotations\"},{\"reason\":\"FieldValueTooLong\",\"message\":\"Too long: must have at most 262144 bytes\",\"field\":\"metadata.annotations\"},{\"reason\":\"FieldValueTooLong\",\"message\":\"Too long: must have at most 262144 bytes\",\"field\":\"metadata.annotations\"},{\"reason\":\"FieldValueTooLong\",\"message\":\"Too long: must have at most 262144 bytes\",\"field\":\"metadata.annotations\"}]},\"code\":422}\\n'\n\nError: Config modification aborted due to concurrent changes\n" stdout="--- \n+++ \n@@ -3,12 +3,13 @@\n   parameters:\n     archive_command: pgbackrest --stanza=db archive-push \"%p\"\n     archive_mode: 'on'\n-    archive_timeout: 0\n+    archive_timeout: 300\n     autovacuum: 'on'\n-    autovacuum_max_workers: 3\n-    autovacuum_vacuum_cost_limit: 1000\n+    autovacuum_analyze_scale_factor: 0.01\n+    autovacuum_analyze_threshold: 50\n+    autovacuum_max_workers: 5\n     autovacuum_vacuum_scale_factor: 0.02\n-    autovacuum_vacuum_threshold: 100\n+    autovacuum_vacuum_threshold: 50\n     idle_in_transaction_session_timeout: 600000\n     jit: 'off'\n     lc_messages: en_US.UTF8\n@@ -25,11 +26,13 @@\n     log_min_duration_statement: '0'\n     log_statement: none\n     log_temp_files: '0'\n-    maintenance_work_mem: 64MB\n+    maintenance_work_mem: 128MB\n     max_connections: 1000\n+    
max_parallel_workers: 20\n     max_prepared_transactions: 1000\n     max_stack_depth: 2MB\n     max_wal_size: 1GB\n+    max_worker_processes: 50\n     min_wal_size: 80MB\n     password_encryption: scram-sha-256\n     pgnodemx.kdapi_path: /etc/database-containerinfo\n@@ -40,21 +43,22 @@\n     ssl_ca_file: /pgconf/tls/ca.crt\n     ssl_cert_file: /pgconf/tls/tls.crt\n     ssl_key_file: /pgconf/tls/tls.key\n-    synchronous_commit: 'off'\n+    synchronous_commit: local\n     synchronous_standby_names: '*'\n     unix_socket_directories: /tmp/postgres\n     wal_buffers: 16MB\n     wal_level: logical\n     wal_receiver_status_interval: 1s\n     wal_writer_delay: 10ms\n-    work_mem: 16MB\n+    work_mem: 32MB\n   pg_hba:\n   - local all \"postgres\" peer\n   - hostssl replication \"_crunchyrepl\" all cert\n   - hostssl \"postgres\" \"_crunchyrepl\" all cert\n   - host all \"_crunchyrepl\" all reject\n-  - host all \"ccp_monitoring\" \"127.0.0.0/8\" md5\n-  - host all \"ccp_monitoring\" \"::1/128\" md5\n+  - host all \"ccp_monitoring\" \"127.0.0.0/8\" scram-sha-256\n+  - host all \"ccp_monitoring\" \"::1/128\" scram-sha-256\n+  - host all \"ccp_monitoring\" all reject\n   - hostssl all \"_crunchypgbouncer\" all scram-sha-256\n   - host all \"_crunchypgbouncer\" all reject\n   - host all all 127.0.0.1/32 trust\n" version=5.3.0-0
time="2023-03-30T02:00:22Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=smps namespace=int-smps postgresCluster=int-smps/smps reconcileID=fe3bb221-9e37-428b-877d-bcd4a9b249f3 version=5.3.0-0
time="2023-03-30T02:00:22Z" level=error msg="Reconciler error" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster error="command terminated with exit code 1" file="internal/controller/postgrescluster/patroni.go:225" func="postgrescluster.(*Reconciler).reconcilePatroniDynamicConfiguration" name=smps namespace=int-smps postgresCluster=int-smps/smps reconcileID=fe3bb221-9e37-428b-877d-bcd4a9b249f3 version=5.3.0-0
time="2023-03-30T02:00:23Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=smps namespace=int-smps postgresCluster=int-smps/smps reconcileID=3cd91d9d-32c0-4dcf-b340-1fcfe728245e stderr="2023-03-30 02:00:23,227 - ERROR - Unexpected error from Kubernetes API\nTraceback (most recent call last):\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 498, in wrapper\n    return func(*args, **kwargs)\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 937, in patch_or_create\n    raise e\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 931, in patch_or_create\n    return self._patch_or_create(name, annotations, resource_version, patch, retry, ips)\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 921, in _patch_or_create\n    ret = retry(func, self._namespace, body) if retry else func(self._namespace, body)\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 483, in wrapper\n    return getattr(self._core_v1_api, func)(*args, **kwargs)\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 419, in wrapper\n    return self._api_client.call_api(method, path, headers, body, **kwargs)\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 388, in call_api\n    return self._handle_server_response(response, _preload_content)\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 218, in _handle_server_response\n    raise k8s_client.rest.ApiException(http_resp=response)\npatroni.dcs.kubernetes.K8sClient.rest.ApiException: (422)\nReason: Unprocessable Entity\nHTTP response headers: HTTPHeaderDict({'Audit-Id': '30cef7e4-97e6-4562-8adc-2978ff3b9a3f', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 
'X-Kubernetes-Pf-Flowschema-Uid': 'd8c76dee-5d73-4e14-9caa-711f3aec3afd', 'X-Kubernetes-Pf-Prioritylevel-Uid': '3d34cb27-5ead-4176-a761-38e53c5aabbb', 'Date': 'Thu, 30 Mar 2023 02:00:23 GMT', 'Content-Length': '741'})\nHTTP response body: b'{\"kind\":\"Status\",\"apiVersion\":\"v1\",\"metadata\":{},\"status\":\"Failure\",\"message\":\"Endpoints \\\\\"smps-ha-config\\\\\" is invalid: metadata.annotations: Too long: must have at most 262144 bytes\",\"reason\":\"Invalid\",\"details\":{\"name\":\"smps-ha-config\",\"kind\":\"Endpoints\",\"causes\":[{\"reason\":\"FieldValueTooLong\",\"message\":\"Too long: must have at most 262144 bytes\",\"field\":\"metadata.annotations\"},{\"reason\":\"FieldValueTooLong\",\"message\":\"Too long: must have at most 262144 bytes\",\"field\":\"metadata.annotations\"},{\"reason\":\"FieldValueTooLong\",\"message\":\"Too long: must have at most 262144 bytes\",\"field\":\"metadata.annotations\"},{\"reason\":\"FieldValueTooLong\",\"message\":\"Too long: must have at most 262144 bytes\",\"field\":\"metadata.annotations\"}]},\"code\":422}\\n'\n\nError: Config modification aborted due to concurrent changes\n" stdout="--- \n+++ \n@@ -3,12 +3,13 @@\n   parameters:\n     archive_command: pgbackrest --stanza=db archive-push \"%p\"\n     archive_mode: 'on'\n-    archive_timeout: 0\n+    archive_timeout: 300\n     autovacuum: 'on'\n-    autovacuum_max_workers: 3\n-    autovacuum_vacuum_cost_limit: 1000\n+    autovacuum_analyze_scale_factor: 0.01\n+    autovacuum_analyze_threshold: 50\n+    autovacuum_max_workers: 5\n     autovacuum_vacuum_scale_factor: 0.02\n-    autovacuum_vacuum_threshold: 100\n+    autovacuum_vacuum_threshold: 50\n     idle_in_transaction_session_timeout: 600000\n     jit: 'off'\n     lc_messages: en_US.UTF8\n@@ -25,11 +26,13 @@\n     log_min_duration_statement: '0'\n     log_statement: none\n     log_temp_files: '0'\n-    maintenance_work_mem: 64MB\n+    maintenance_work_mem: 128MB\n     max_connections: 1000\n+    
max_parallel_workers: 20\n     max_prepared_transactions: 1000\n     max_stack_depth: 2MB\n     max_wal_size: 1GB\n+    max_worker_processes: 50\n     min_wal_size: 80MB\n     password_encryption: scram-sha-256\n     pgnodemx.kdapi_path: /etc/database-containerinfo\n@@ -40,21 +43,22 @@\n     ssl_ca_file: /pgconf/tls/ca.crt\n     ssl_cert_file: /pgconf/tls/tls.crt\n     ssl_key_file: /pgconf/tls/tls.key\n-    synchronous_commit: 'off'\n+    synchronous_commit: local\n     synchronous_standby_names: '*'\n     unix_socket_directories: /tmp/postgres\n     wal_buffers: 16MB\n     wal_level: logical\n     wal_receiver_status_interval: 1s\n     wal_writer_delay: 10ms\n-    work_mem: 16MB\n+    work_mem: 32MB\n   pg_hba:\n   - local all \"postgres\" peer\n   - hostssl replication \"_crunchyrepl\" all cert\n   - hostssl \"postgres\" \"_crunchyrepl\" all cert\n   - host all \"_crunchyrepl\" all reject\n-  - host all \"ccp_monitoring\" \"127.0.0.0/8\" md5\n-  - host all \"ccp_monitoring\" \"::1/128\" md5\n+  - host all \"ccp_monitoring\" \"127.0.0.0/8\" scram-sha-256\n+  - host all \"ccp_monitoring\" \"::1/128\" scram-sha-256\n+  - host all \"ccp_monitoring\" all reject\n   - hostssl all \"_crunchypgbouncer\" all scram-sha-256\n   - host all \"_crunchypgbouncer\" all reject\n   - host all all 127.0.0.1/32 trust\n" version=5.3.0-0
time="2023-03-30T02:00:23Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=smps namespace=int-smps postgresCluster=int-smps/smps reconcileID=3cd91d9d-32c0-4dcf-b340-1fcfe728245e version=5.3.0-0
time="2023-03-30T02:00:23Z" level=error msg="Reconciler error" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster error="command terminated with exit code 1" file="internal/controller/postgrescluster/patroni.go:225" func="postgrescluster.(*Reconciler).reconcilePatroniDynamicConfiguration" name=smps namespace=int-smps postgresCluster=int-smps/smps reconcileID=3cd91d9d-32c0-4dcf-b340-1fcfe728245e version=5.3.0-0
time="2023-03-30T02:00:33Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=0e7233a6-7728-4433-b3a4-6b247cc3794c stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:00:34Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=0e7233a6-7728-4433-b3a4-6b247cc3794c version=5.3.0-0
time="2023-03-30T02:00:34Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=0e7233a6-7728-4433-b3a4-6b247cc3794c version=5.3.0-0
time="2023-03-30T02:00:34Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=0e7233a6-7728-4433-b3a4-6b247cc3794c version=5.3.0-0
time="2023-03-30T02:00:34Z" level=error msg="unable to reconcile manual backup" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster error="jobs.batch \"pnst-backup-9vfg\" not found" file="internal/controller/postgrescluster/pgbackrest.go:2106" func="postgrescluster.(*Reconciler).reconcileManualBackup" name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=0e7233a6-7728-4433-b3a4-6b247cc3794c reconciler=pgBackRest version=5.3.0-0
time="2023-03-30T02:00:34Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=0e7233a6-7728-4433-b3a4-6b247cc3794c version=5.3.0-0
time="2023-03-30T02:00:34Z" level=debug msg="patched cluster status" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=0e7233a6-7728-4433-b3a4-6b247cc3794c version=5.3.0-0
time="2023-03-30T02:00:35Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=92d03bba-3d44-4803-abf1-3baddfd2a11d stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:00:35Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=92d03bba-3d44-4803-abf1-3baddfd2a11d version=5.3.0-0
time="2023-03-30T02:00:35Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=92d03bba-3d44-4803-abf1-3baddfd2a11d version=5.3.0-0
time="2023-03-30T02:00:35Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=92d03bba-3d44-4803-abf1-3baddfd2a11d version=5.3.0-0
time="2023-03-30T02:00:35Z" level=error msg="unable to reconcile manual backup" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster error="jobs.batch \"pnst-backup-cw5h\" not found" file="internal/controller/postgrescluster/pgbackrest.go:2106" func="postgrescluster.(*Reconciler).reconcileManualBackup" name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=92d03bba-3d44-4803-abf1-3baddfd2a11d reconciler=pgBackRest version=5.3.0-0
time="2023-03-30T02:00:35Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=92d03bba-3d44-4803-abf1-3baddfd2a11d version=5.3.0-0
time="2023-03-30T02:00:35Z" level=debug msg="patched cluster status" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=92d03bba-3d44-4803-abf1-3baddfd2a11d version=5.3.0-0
time="2023-03-30T02:00:36Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=4b8409e5-4615-43fc-8376-eed6ad23ba97 stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:00:36Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=4b8409e5-4615-43fc-8376-eed6ad23ba97 version=5.3.0-0
time="2023-03-30T02:00:36Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=4b8409e5-4615-43fc-8376-eed6ad23ba97 version=5.3.0-0
time="2023-03-30T02:00:36Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=4b8409e5-4615-43fc-8376-eed6ad23ba97 version=5.3.0-0
time="2023-03-30T02:00:37Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=4b8409e5-4615-43fc-8376-eed6ad23ba97 version=5.3.0-0
time="2023-03-30T02:00:37Z" level=debug msg="patched cluster status" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=4b8409e5-4615-43fc-8376-eed6ad23ba97 version=5.3.0-0
time="2023-03-30T02:00:37Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=0c3e2256-750b-4ce6-9977-e103597ec755 stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:00:38Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=0c3e2256-750b-4ce6-9977-e103597ec755 version=5.3.0-0
time="2023-03-30T02:00:38Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=0c3e2256-750b-4ce6-9977-e103597ec755 version=5.3.0-0
time="2023-03-30T02:00:38Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=0c3e2256-750b-4ce6-9977-e103597ec755 version=5.3.0-0
time="2023-03-30T02:00:38Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=0c3e2256-750b-4ce6-9977-e103597ec755 version=5.3.0-0
time="2023-03-30T02:00:38Z" level=debug msg="patched cluster status" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=0c3e2256-750b-4ce6-9977-e103597ec755 version=5.3.0-0
time="2023-03-30T02:00:39Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=059a3168-cccc-408a-8d81-d8cce11e1156 stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:00:39Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=059a3168-cccc-408a-8d81-d8cce11e1156 version=5.3.0-0
time="2023-03-30T02:00:39Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=059a3168-cccc-408a-8d81-d8cce11e1156 version=5.3.0-0
time="2023-03-30T02:00:39Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=059a3168-cccc-408a-8d81-d8cce11e1156 version=5.3.0-0
time="2023-03-30T02:00:39Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=059a3168-cccc-408a-8d81-d8cce11e1156 version=5.3.0-0
time="2023-03-30T02:00:39Z" level=debug msg="patched cluster status" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=059a3168-cccc-408a-8d81-d8cce11e1156 version=5.3.0-0
time="2023-03-30T02:00:39Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=smps namespace=int2-smps postgresCluster=int2-smps/smps reconcileID=19efb485-4dd6-42da-ac2e-e9e954849ad8 stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:00:39Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=smps-instance1-cqpb name=smps namespace=int2-smps postgresCluster=int2-smps/smps reconcileID=19efb485-4dd6-42da-ac2e-e9e954849ad8 version=5.3.0-0
time="2023-03-30T02:00:39Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=smps namespace=int2-smps postgresCluster=int2-smps/smps reconcileID=19efb485-4dd6-42da-ac2e-e9e954849ad8 version=5.3.0-0
time="2023-03-30T02:00:40Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=smps namespace=int2-smps postgresCluster=int2-smps/smps reconcileID=19efb485-4dd6-42da-ac2e-e9e954849ad8 version=5.3.0-0
time="2023-03-30T02:00:40Z" level=debug msg="patched cluster status" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=smps namespace=int2-smps postgresCluster=int2-smps/smps reconcileID=19efb485-4dd6-42da-ac2e-e9e954849ad8 version=5.3.0-0
time="2023-03-30T02:00:40Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=2f085088-a404-41aa-b4a7-d44eb664026a stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:00:40Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=2f085088-a404-41aa-b4a7-d44eb664026a version=5.3.0-0
time="2023-03-30T02:00:40Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=2f085088-a404-41aa-b4a7-d44eb664026a version=5.3.0-0
time="2023-03-30T02:00:40Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=2f085088-a404-41aa-b4a7-d44eb664026a version=5.3.0-0
time="2023-03-30T02:00:40Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=2f085088-a404-41aa-b4a7-d44eb664026a version=5.3.0-0
time="2023-03-30T02:00:41Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=smps namespace=int2-smps postgresCluster=int2-smps/smps reconcileID=cf98ef4a-7f17-44b8-a8ca-af74c32305cb stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:00:41Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=smps-instance1-cqpb name=smps namespace=int2-smps postgresCluster=int2-smps/smps reconcileID=cf98ef4a-7f17-44b8-a8ca-af74c32305cb version=5.3.0-0
time="2023-03-30T02:00:41Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=smps namespace=int2-smps postgresCluster=int2-smps/smps reconcileID=cf98ef4a-7f17-44b8-a8ca-af74c32305cb version=5.3.0-0
time="2023-03-30T02:00:41Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=smps namespace=int2-smps postgresCluster=int2-smps/smps reconcileID=cf98ef4a-7f17-44b8-a8ca-af74c32305cb version=5.3.0-0
time="2023-03-30T02:00:41Z" level=debug msg="patched cluster status" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=smps namespace=int2-smps postgresCluster=int2-smps/smps reconcileID=cf98ef4a-7f17-44b8-a8ca-af74c32305cb version=5.3.0-0
time="2023-03-30T02:00:42Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=smps namespace=int2-smps postgresCluster=int2-smps/smps reconcileID=ae36e9d6-4312-4ade-9d72-d568c278e382 stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:00:42Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=smps-instance1-cqpb name=smps namespace=int2-smps postgresCluster=int2-smps/smps reconcileID=ae36e9d6-4312-4ade-9d72-d568c278e382 version=5.3.0-0
time="2023-03-30T02:00:42Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=smps namespace=int2-smps postgresCluster=int2-smps/smps reconcileID=ae36e9d6-4312-4ade-9d72-d568c278e382 version=5.3.0-0
time="2023-03-30T02:00:42Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=smps namespace=int2-smps postgresCluster=int2-smps/smps reconcileID=ae36e9d6-4312-4ade-9d72-d568c278e382 version=5.3.0-0
time="2023-03-30T02:00:48Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=6fe002e7-fe5f-4867-8891-2aae28bc7002 stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:00:48Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=6fe002e7-fe5f-4867-8891-2aae28bc7002 version=5.3.0-0
time="2023-03-30T02:00:48Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=6fe002e7-fe5f-4867-8891-2aae28bc7002 version=5.3.0-0
time="2023-03-30T02:00:48Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=6fe002e7-fe5f-4867-8891-2aae28bc7002 version=5.3.0-0
time="2023-03-30T02:00:48Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=6fe002e7-fe5f-4867-8891-2aae28bc7002 version=5.3.0-0
time="2023-03-30T02:01:08Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=971b22a0-6cab-4a19-8cf3-547e40b34f03 stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:01:08Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=971b22a0-6cab-4a19-8cf3-547e40b34f03 version=5.3.0-0
time="2023-03-30T02:01:08Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=971b22a0-6cab-4a19-8cf3-547e40b34f03 version=5.3.0-0
time="2023-03-30T02:01:08Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=971b22a0-6cab-4a19-8cf3-547e40b34f03 version=5.3.0-0
time="2023-03-30T02:01:08Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=971b22a0-6cab-4a19-8cf3-547e40b34f03 version=5.3.0-0
time="2023-03-30T02:01:16Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=e2e852c1-ede6-4da0-b821-31a228150936 stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:01:16Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=e2e852c1-ede6-4da0-b821-31a228150936 version=5.3.0-0
time="2023-03-30T02:01:16Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=e2e852c1-ede6-4da0-b821-31a228150936 version=5.3.0-0
time="2023-03-30T02:01:16Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=e2e852c1-ede6-4da0-b821-31a228150936 version=5.3.0-0
time="2023-03-30T02:01:16Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=e2e852c1-ede6-4da0-b821-31a228150936 version=5.3.0-0
time="2023-03-30T02:01:16Z" level=debug msg="patched cluster status" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=e2e852c1-ede6-4da0-b821-31a228150936 version=5.3.0-0
time="2023-03-30T02:01:17Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=3fe42714-d155-4e4e-9d66-e3bf3ef5701c stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:01:17Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=3fe42714-d155-4e4e-9d66-e3bf3ef5701c version=5.3.0-0
time="2023-03-30T02:01:17Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=3fe42714-d155-4e4e-9d66-e3bf3ef5701c version=5.3.0-0
time="2023-03-30T02:01:17Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=3fe42714-d155-4e4e-9d66-e3bf3ef5701c version=5.3.0-0
time="2023-03-30T02:01:17Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=3fe42714-d155-4e4e-9d66-e3bf3ef5701c version=5.3.0-0
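
The debug log above interleaves reconcile passes for two clusters (dev-pnst-test/pnst and int2-smps/smps), which makes it hard to see how often each one is being reconciled. A minimal sketch for counting distinct `reconcileID` values per cluster follows. The function name `reconciles_per_cluster` is hypothetical, and the sketch assumes the space-separated `key=value` layout shown in these log lines; it is not part of PGO or the `oc` CLI.

```shell
#!/usr/bin/env bash
# Count distinct reconcile passes per cluster from operator log lines on stdin.
# Assumes each line carries postgresCluster=<ns>/<name> and reconcileID=<uuid>
# as whitespace-separated fields, as in the log excerpt above.
reconciles_per_cluster() {
  awk '{
         cluster = ""; id = ""
         for (i = 1; i <= NF; i++) {
           if ($i ~ /^postgresCluster=/) cluster = substr($i, 17)  # text after "postgresCluster="
           if ($i ~ /^reconcileID=/)     id = substr($i, 13)       # text after "reconcileID="
         }
         # Count each (cluster, reconcileID) pair only once.
         if (cluster != "" && id != "" && !seen[cluster SUBSEP id]++) count[cluster]++
       }
       END { for (c in count) print c, count[c] }' | sort
}

# Usage (log file name is a placeholder):
#   reconciles_per_cluster < operator.log
```

Each output line is a cluster followed by the number of distinct reconcile passes seen for it in the input.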

Eric-zch · Mar 30 '23 02:03

Hi @dsessler7

Below are my test results for a one-off backup, following https://access.crunchydata.com/documentation/postgres-operator/v5/tutorial/backup-management/

[zhaoch@~]$oc annotate postgrescluster pnst --overwrite postgres-operator.crunchydata.com/pgbackrest-backup="$(date)"
postgrescluster.postgres-operator.crunchydata.com/pnst annotated

[zhaoch@~]$oc pgo show backup pnst
stanza: db
    status: ok
    cipher: none
    ......
        full backup: 20230330-024013F
            timestamp start/stop: 2023-03-30 02:40:13 / 2023-03-30 02:40:44
            wal start/stop: 0000000700000005000000AE / 0000000700000005000000AE
            database size: 1.1GB, database backup size: 1.1GB
            repo1: backup set size: 93.4MB, backup size: 93.4MB

        full backup: 20230330-024133F
            timestamp start/stop: 2023-03-30 02:41:33 / 2023-03-30 02:42:03
            wal start/stop: 0000000700000005000000B0 / 0000000700000005000000B0
            database size: 1.1GB, database backup size: 1.1GB
            repo1: backup set size: 93.4MB, backup size: 93.4MB

Two Jobs (pnst-backup-r694 and pnst-backup-sgjr) were started for this single backup request.

[zhaoch@~]$oc get job
NAME                       COMPLETIONS   DURATION   AGE
pnst-backup-gx8j           1/1           115s       6d19h
pnst-backup-r694           1/1           116s       2m15s
pnst-backup-sgjr           1/1           37s        2m15s
pnst-repo1-full-27995040   1/1           57s        5d2h
pnst-repo1-incr-27999420   1/1           14s        2d1h
pnst-repo1-incr-28000860   1/1           15s        25h
pnst-repo1-incr-28002300   1/1           15s        102m
[zhaoch@~]$

There were 5 Pods started for the backup.
Error Pods: pnst-backup-r694-f7r7n, pnst-backup-r694-vcjzt, pnst-backup-r694-zsjkx
Completed Pods: pnst-backup-r694-m7dc4, pnst-backup-sgjr-2wlvc

[zhaoch@~]$oc get po
NAME                             READY   STATUS      RESTARTS   AGE
asb-activemq5-6bff45c5b8-j9fhf   1/1     Running     0          9d
pnst-backup-gx8j-zf42g           0/1     Completed   0          6d19h
pnst-backup-r694-f7r7n           0/1     Error       0          2m23s
pnst-backup-r694-m7dc4           0/1     Completed   0          64s
pnst-backup-r694-vcjzt           0/1     Error       0          2m13s
pnst-backup-r694-zsjkx           0/1     Error       0          113s
pnst-backup-sgjr-2wlvc           0/1     Completed   0          2m23s
pnst-instance1-hjkm-0            5/5     Running     0          6d18h
pnst-instance1-tw8w-0            5/5     Running     0          5d21h
pnst-pgbouncer-647dcd786-4t2p7   2/2     Running     0          6d19h
pnst-repo-host-0                 2/2     Running     0          6d19h
pnst-repo1-full-27995040-9ntdd   0/1     Completed   0          5d2h
pnst-repo1-incr-27999420-w2b2v   0/1     Completed   0          2d1h
pnst-repo1-incr-28000860-zzfdz   0/1     Completed   0          25h
pnst-repo1-incr-28002300-9kwsm   0/1     Completed   0          102m
[zhaoch@~]$
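
To make the Job-to-Pod grouping in the listing above easier to read, here is a minimal sketch that summarizes backup Pods per Job and status. The function name `summarize_backup_pods` is hypothetical. It assumes each Pod name is the Job name plus one trailing `-<suffix>` segment and that STATUS is the third column, as in the `oc get po` output above.

```shell
#!/usr/bin/env bash
# Summarize backup Pods per Job and status from `oc get po` output on stdin.
# Assumption: Pod name = "<job-name>-<suffix>", STATUS is column 3.
summarize_backup_pods() {
  awk '$1 ~ /-backup-/ {
         job = $1
         sub(/-[^-]+$/, "", job)     # drop the per-Pod suffix to get the Job name
         count[job " " $3]++
       }
       END { for (k in count) print k, count[k] }' | sort
}

# Usage:
#   oc get po --no-headers | summarize_backup_pods
```

Run against the listing above, this would put the three Error Pods and one Completed Pod under pnst-backup-r694 and the single Completed Pod under pnst-backup-sgjr.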

Pod messages

[zhaoch@~]$oc logs -f pnst-backup-r694-f7r7n
time="2023-03-30T02:40:12Z" level=info msg="crunchy-pgbackrest starts"
time="2023-03-30T02:40:12Z" level=info msg="debug flag set to false"
time="2023-03-30T02:40:12Z" level=info msg="backrest backup command requested"
time="2023-03-30T02:40:12Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --repo=1 --type=full]"
time="2023-03-30T02:40:12Z" level=info msg="output=[]"
time="2023-03-30T02:40:12Z" level=info msg="stderr=[ERROR: [050]: unable to acquire lock on file '/tmp/pgbackrest/db-backup.lock': Resource temporarily unavailable\n       HINT: is another pgBackRest process running?\n]"
time="2023-03-30T02:40:12Z" level=fatal msg="command terminated with exit code 50"
[zhaoch@~]$oc logs -f pnst-backup-r694-vcjzt
time="2023-03-30T02:40:22Z" level=info msg="crunchy-pgbackrest starts"
time="2023-03-30T02:40:22Z" level=info msg="debug flag set to false"
time="2023-03-30T02:40:22Z" level=info msg="backrest backup command requested"
time="2023-03-30T02:40:22Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --repo=1 --type=full]"
time="2023-03-30T02:40:22Z" level=info msg="output=[]"
time="2023-03-30T02:40:22Z" level=info msg="stderr=[ERROR: [050]: unable to acquire lock on file '/tmp/pgbackrest/db-backup.lock': Resource temporarily unavailable\n       HINT: is another pgBackRest process running?\n]"
time="2023-03-30T02:40:22Z" level=fatal msg="command terminated with exit code 50"
[zhaoch@~]$oc logs -f pnst-backup-r694-zsjkx
time="2023-03-30T02:40:42Z" level=info msg="crunchy-pgbackrest starts"
time="2023-03-30T02:40:42Z" level=info msg="debug flag set to false"
time="2023-03-30T02:40:42Z" level=info msg="backrest backup command requested"
time="2023-03-30T02:40:42Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --repo=1 --type=full]"
time="2023-03-30T02:40:42Z" level=info msg="output=[]"
time="2023-03-30T02:40:42Z" level=info msg="stderr=[ERROR: [050]: unable to acquire lock on file '/tmp/pgbackrest/db-backup.lock': Resource temporarily unavailable\n       HINT: is another pgBackRest process running?\n]"
time="2023-03-30T02:40:42Z" level=fatal msg="command terminated with exit code 50"
[zhaoch@~]$oc logs -f pnst-backup-r694-m7dc4
time="2023-03-30T02:41:31Z" level=info msg="crunchy-pgbackrest starts"
time="2023-03-30T02:41:31Z" level=info msg="debug flag set to false"
time="2023-03-30T02:41:31Z" level=info msg="backrest backup command requested"
time="2023-03-30T02:41:31Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --repo=1 --type=full]"
time="2023-03-30T02:42:04Z" level=info msg="output=[]"
time="2023-03-30T02:42:04Z" level=info msg="stderr=[]"
time="2023-03-30T02:42:04Z" level=info msg="crunchy-pgbackrest ends"
[zhaoch@~]$oc logs -f pnst-backup-sgjr-2wlvc
time="2023-03-30T02:40:12Z" level=info msg="crunchy-pgbackrest starts"
time="2023-03-30T02:40:12Z" level=info msg="debug flag set to false"
time="2023-03-30T02:40:12Z" level=info msg="backrest backup command requested"
time="2023-03-30T02:40:12Z" level=info msg="command to execute is [pgbackrest backup --stanza=db --repo=1 --type=full]"
time="2023-03-30T02:40:44Z" level=info msg="output=[]"
time="2023-03-30T02:40:44Z" level=info msg="stderr=[]"
time="2023-03-30T02:40:44Z" level=info msg="crunchy-pgbackrest ends"
[zhaoch@~]$
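
The timestamps in these logs suggest a pattern: pnst-backup-sgjr-2wlvc ran from 02:40:12 to 02:40:44, the three failed pnst-backup-r694 Pods (02:40:12, 02:40:22, 02:40:42) all started inside that window and hit the pgBackRest lock error 050, and pnst-backup-r694-m7dc4 only succeeded once it started at 02:41:31. A minimal sketch for extracting that window and outcome from a single backup Pod log is below. The function name `backup_log_window` and the result labels are hypothetical, and the parsing assumes the `time="..."` log format shown above.

```shell
#!/usr/bin/env bash
# Print "<first timestamp> <last timestamp> <result>" for one backup Pod log on stdin.
# result: lock-conflict (pgBackRest error 050 seen), failed (other fatal line),
#         ok ("crunchy-pgbackrest ends" seen), unknown (none of these).
backup_log_window() {
  awk '{
         if (match($0, /time="[^"]+"/)) {
           ts = substr($0, RSTART + 6, RLENGTH - 7)   # strip time=" and closing quote
           if (first == "") first = ts
           last = ts
         }
         if ($0 ~ /ERROR: \[050\]/) result = "lock-conflict"
         else if ($0 ~ /level=fatal/ && result == "") result = "failed"
         else if ($0 ~ /crunchy-pgbackrest ends/ && result == "") result = "ok"
       }
       END { if (result == "") result = "unknown"; print first, last, result }'
}

# Usage:
#   oc logs pnst-backup-r694-f7r7n | backup_log_window
```

Comparing the windows of Pods from the two Jobs side by side shows whether the failures overlap with another running backup, which is what the lock error hints at.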

Operator Pod messages during the backup.

[zhaoch@~]$oc logs -f pgo-d96694788-zj42s -n crunchy-postgres-operator
......
time="2023-03-30T02:40:07Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=9b54ba79-2d6d-4623-8cb9-1941bf9d4bd0 stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:40:07Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=9b54ba79-2d6d-4623-8cb9-1941bf9d4bd0 version=5.3.0-0
time="2023-03-30T02:40:07Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=9b54ba79-2d6d-4623-8cb9-1941bf9d4bd0 version=5.3.0-0
time="2023-03-30T02:40:07Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=9b54ba79-2d6d-4623-8cb9-1941bf9d4bd0 version=5.3.0-0
time="2023-03-30T02:40:07Z" level=error msg="unable to reconcile manual backup" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster error="jobs.batch \"pnst-backup-2p8h\" not found" file="internal/controller/postgrescluster/pgbackrest.go:2106" func="postgrescluster.(*Reconciler).reconcileManualBackup" name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=9b54ba79-2d6d-4623-8cb9-1941bf9d4bd0 reconciler=pgBackRest version=5.3.0-0
time="2023-03-30T02:40:07Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=9b54ba79-2d6d-4623-8cb9-1941bf9d4bd0 version=5.3.0-0
time="2023-03-30T02:40:08Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=fbdcb721-534d-4655-a2e8-bcd732004532 stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:40:08Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=fbdcb721-534d-4655-a2e8-bcd732004532 version=5.3.0-0
time="2023-03-30T02:40:08Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=fbdcb721-534d-4655-a2e8-bcd732004532 version=5.3.0-0
time="2023-03-30T02:40:08Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=fbdcb721-534d-4655-a2e8-bcd732004532 version=5.3.0-0
time="2023-03-30T02:40:09Z" level=error msg="unable to reconcile manual backup" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster error="jobs.batch \"pnst-backup-vbrw\" not found" file="internal/controller/postgrescluster/pgbackrest.go:2106" func="postgrescluster.(*Reconciler).reconcileManualBackup" name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=fbdcb721-534d-4655-a2e8-bcd732004532 reconciler=pgBackRest version=5.3.0-0
time="2023-03-30T02:40:09Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=fbdcb721-534d-4655-a2e8-bcd732004532 version=5.3.0-0
time="2023-03-30T02:40:09Z" level=debug msg="patched cluster status" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=fbdcb721-534d-4655-a2e8-bcd732004532 version=5.3.0-0
time="2023-03-30T02:40:09Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=8cd882c0-c4b7-4733-870c-9d2a58f230e7 stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:40:10Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=8cd882c0-c4b7-4733-870c-9d2a58f230e7 version=5.3.0-0
time="2023-03-30T02:40:10Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=8cd882c0-c4b7-4733-870c-9d2a58f230e7 version=5.3.0-0
time="2023-03-30T02:40:10Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=8cd882c0-c4b7-4733-870c-9d2a58f230e7 version=5.3.0-0
time="2023-03-30T02:40:10Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=8cd882c0-c4b7-4733-870c-9d2a58f230e7 version=5.3.0-0
time="2023-03-30T02:40:10Z" level=debug msg="patched cluster status" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=8cd882c0-c4b7-4733-870c-9d2a58f230e7 version=5.3.0-0
time="2023-03-30T02:40:11Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=223f5a7c-86ef-4b77-864f-1d0ab68175e4 stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:40:11Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=223f5a7c-86ef-4b77-864f-1d0ab68175e4 version=5.3.0-0
time="2023-03-30T02:40:11Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=223f5a7c-86ef-4b77-864f-1d0ab68175e4 version=5.3.0-0
time="2023-03-30T02:40:11Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=223f5a7c-86ef-4b77-864f-1d0ab68175e4 version=5.3.0-0
time="2023-03-30T02:40:11Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=223f5a7c-86ef-4b77-864f-1d0ab68175e4 version=5.3.0-0
time="2023-03-30T02:40:11Z" level=debug msg="patched cluster status" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=223f5a7c-86ef-4b77-864f-1d0ab68175e4 version=5.3.0-0
time="2023-03-30T02:40:12Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=b693c7e6-8132-4723-99c3-0742861fdc8c stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:40:12Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=b693c7e6-8132-4723-99c3-0742861fdc8c version=5.3.0-0
time="2023-03-30T02:40:12Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=b693c7e6-8132-4723-99c3-0742861fdc8c version=5.3.0-0
time="2023-03-30T02:40:12Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=b693c7e6-8132-4723-99c3-0742861fdc8c version=5.3.0-0
time="2023-03-30T02:40:13Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=b693c7e6-8132-4723-99c3-0742861fdc8c version=5.3.0-0
time="2023-03-30T02:40:13Z" level=debug msg="patched cluster status" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=b693c7e6-8132-4723-99c3-0742861fdc8c version=5.3.0-0
time="2023-03-30T02:40:14Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=674f30fe-912e-4dda-b41d-c02c3bbc0e31 stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:40:14Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=674f30fe-912e-4dda-b41d-c02c3bbc0e31 version=5.3.0-0
time="2023-03-30T02:40:14Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=674f30fe-912e-4dda-b41d-c02c3bbc0e31 version=5.3.0-0
time="2023-03-30T02:40:14Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=674f30fe-912e-4dda-b41d-c02c3bbc0e31 version=5.3.0-0
time="2023-03-30T02:40:14Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=674f30fe-912e-4dda-b41d-c02c3bbc0e31 version=5.3.0-0
time="2023-03-30T02:40:21Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=8207cbcf-41ff-40e0-b4fc-a1ca7e2c832c stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:40:21Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=8207cbcf-41ff-40e0-b4fc-a1ca7e2c832c version=5.3.0-0
time="2023-03-30T02:40:21Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=8207cbcf-41ff-40e0-b4fc-a1ca7e2c832c version=5.3.0-0
time="2023-03-30T02:40:21Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=8207cbcf-41ff-40e0-b4fc-a1ca7e2c832c version=5.3.0-0
time="2023-03-30T02:40:21Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=8207cbcf-41ff-40e0-b4fc-a1ca7e2c832c version=5.3.0-0
time="2023-03-30T02:40:21Z" level=debug msg="patched cluster status" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=8207cbcf-41ff-40e0-b4fc-a1ca7e2c832c version=5.3.0-0
time="2023-03-30T02:40:22Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=0c768bde-34e5-48a8-99c2-4c2524aed72a stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:40:22Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=0c768bde-34e5-48a8-99c2-4c2524aed72a version=5.3.0-0
time="2023-03-30T02:40:22Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=0c768bde-34e5-48a8-99c2-4c2524aed72a version=5.3.0-0
time="2023-03-30T02:40:22Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=0c768bde-34e5-48a8-99c2-4c2524aed72a version=5.3.0-0
time="2023-03-30T02:40:22Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=0c768bde-34e5-48a8-99c2-4c2524aed72a version=5.3.0-0
time="2023-03-30T02:40:33Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=oc namespace=int2-ocz postgresCluster=int2-ocz/oc reconcileID=6f497c22-5fc1-4405-b994-4cd1431f4ff6 stderr="2023-03-30 02:40:33,552 - ERROR - Unexpected error from Kubernetes API\nTraceback (most recent call last):\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 498, in wrapper\n    return func(*args, **kwargs)\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 937, in patch_or_create\n    raise e\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 931, in patch_or_create\n    return self._patch_or_create(name, annotations, resource_version, patch, retry, ips)\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 921, in _patch_or_create\n    ret = retry(func, self._namespace, body) if retry else func(self._namespace, body)\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 483, in wrapper\n    return getattr(self._core_v1_api, func)(*args, **kwargs)\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 419, in wrapper\n    return self._api_client.call_api(method, path, headers, body, **kwargs)\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 388, in call_api\n    return self._handle_server_response(response, _preload_content)\n  File \"/usr/local/lib/python3.6/site-packages/patroni/dcs/kubernetes.py\", line 218, in _handle_server_response\n    raise k8s_client.rest.ApiException(http_resp=response)\npatroni.dcs.kubernetes.K8sClient.rest.ApiException: (422)\nReason: Unprocessable Entity\nHTTP response headers: HTTPHeaderDict({'Audit-Id': '02447031-8911-498c-9867-04dcc982f847', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 
'X-Kubernetes-Pf-Flowschema-Uid': 'd8c76dee-5d73-4e14-9caa-711f3aec3afd', 'X-Kubernetes-Pf-Prioritylevel-Uid': '3d34cb27-5ead-4176-a761-38e53c5aabbb', 'Date': 'Thu, 30 Mar 2023 02:40:33 GMT', 'Content-Length': '737'})\nHTTP response body: b'{\"kind\":\"Status\",\"apiVersion\":\"v1\",\"metadata\":{},\"status\":\"Failure\",\"message\":\"Endpoints \\\\\"oc-ha-config\\\\\" is invalid: metadata.annotations: Too long: must have at most 262144 bytes\",\"reason\":\"Invalid\",\"details\":{\"name\":\"oc-ha-config\",\"kind\":\"Endpoints\",\"causes\":[{\"reason\":\"FieldValueTooLong\",\"message\":\"Too long: must have at most 262144 bytes\",\"field\":\"metadata.annotations\"},{\"reason\":\"FieldValueTooLong\",\"message\":\"Too long: must have at most 262144 bytes\",\"field\":\"metadata.annotations\"},{\"reason\":\"FieldValueTooLong\",\"message\":\"Too long: must have at most 262144 bytes\",\"field\":\"metadata.annotations\"},{\"reason\":\"FieldValueTooLong\",\"message\":\"Too long: must have at most 262144 bytes\",\"field\":\"metadata.annotations\"}]},\"code\":422}\\n'\n\nError: Config modification aborted due to concurrent changes\n" stdout="--- \n+++ \n@@ -3,12 +3,13 @@\n   parameters:\n     archive_command: pgbackrest --stanza=db archive-push \"%p\"\n     archive_mode: 'on'\n-    archive_timeout: 0\n+    archive_timeout: 300\n     autovacuum: 'on'\n-    autovacuum_max_workers: 3\n-    autovacuum_vacuum_cost_limit: 1000\n+    autovacuum_analyze_scale_factor: 0.01\n+    autovacuum_analyze_threshold: 50\n+    autovacuum_max_workers: 5\n     autovacuum_vacuum_scale_factor: 0.02\n-    autovacuum_vacuum_threshold: 100\n+    autovacuum_vacuum_threshold: 50\n     cron.database_name: OC\n     cron.use_background_workers: 'on'\n     idle_in_transaction_session_timeout: 600000\n@@ -27,11 +28,13 @@\n     log_min_duration_statement: '0'\n     log_statement: none\n     log_temp_files: '0'\n-    maintenance_work_mem: 64MB\n+    maintenance_work_mem: 128MB\n     max_connections: 1000\n+    
max_parallel_workers: 20\n     max_prepared_transactions: 1000\n     max_stack_depth: 2MB\n     max_wal_size: 1GB\n+    max_worker_processes: 50\n     min_wal_size: 80MB\n     password_encryption: scram-sha-256\n     pgnodemx.kdapi_path: /etc/database-containerinfo\n@@ -42,14 +45,14 @@\n     ssl_ca_file: /pgconf/tls/ca.crt\n     ssl_cert_file: /pgconf/tls/tls.crt\n     ssl_key_file: /pgconf/tls/tls.key\n-    synchronous_commit: 'off'\n+    synchronous_commit: local\n     synchronous_standby_names: '*'\n     unix_socket_directories: /tmp/postgres\n     wal_buffers: 16MB\n     wal_level: logical\n     wal_receiver_status_interval: 1s\n     wal_writer_delay: 10ms\n-    work_mem: 16MB\n+    work_mem: 32MB\n   pg_hba:\n   - local all \"postgres\" peer\n   - hostssl replication \"_crunchyrepl\" all cert\n" version=5.3.0-0
time="2023-03-30T02:40:33Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=oc namespace=int2-ocz postgresCluster=int2-ocz/oc reconcileID=6f497c22-5fc1-4405-b994-4cd1431f4ff6 version=5.3.0-0
time="2023-03-30T02:40:33Z" level=error msg="Reconciler error" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster error="command terminated with exit code 1" file="internal/controller/postgrescluster/patroni.go:225" func="postgrescluster.(*Reconciler).reconcilePatroniDynamicConfiguration" name=oc namespace=int2-ocz postgresCluster=int2-ocz/oc reconcileID=6f497c22-5fc1-4405-b994-4cd1431f4ff6 version=5.3.0-0
time="2023-03-30T02:40:41Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=c9fa4524-58dd-417f-9f1a-a745c7af7d4c stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:40:41Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=c9fa4524-58dd-417f-9f1a-a745c7af7d4c version=5.3.0-0
time="2023-03-30T02:40:41Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=c9fa4524-58dd-417f-9f1a-a745c7af7d4c version=5.3.0-0
time="2023-03-30T02:40:41Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=c9fa4524-58dd-417f-9f1a-a745c7af7d4c version=5.3.0-0
time="2023-03-30T02:40:41Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=c9fa4524-58dd-417f-9f1a-a745c7af7d4c version=5.3.0-0
time="2023-03-30T02:40:42Z" level=debug msg="patched cluster status" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=c9fa4524-58dd-417f-9f1a-a745c7af7d4c version=5.3.0-0
time="2023-03-30T02:40:42Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=81711609-f3e6-4257-aed5-ed3240901206 stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:40:42Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=81711609-f3e6-4257-aed5-ed3240901206 version=5.3.0-0
time="2023-03-30T02:40:42Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=81711609-f3e6-4257-aed5-ed3240901206 version=5.3.0-0
time="2023-03-30T02:40:42Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=81711609-f3e6-4257-aed5-ed3240901206 version=5.3.0-0
time="2023-03-30T02:40:43Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=81711609-f3e6-4257-aed5-ed3240901206 version=5.3.0-0
time="2023-03-30T02:40:48Z" level=debug msg="replaced configuration" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=2b859848-9551-4119-b580-871d41dc7595 stderr= stdout="Not changed\n" version=5.3.0-0
time="2023-03-30T02:40:48Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-tw8w name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=2b859848-9551-4119-b580-871d41dc7595 version=5.3.0-0
time="2023-03-30T02:40:48Z" level=debug msg="reconciled instance" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance=pnst-instance1-hjkm name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=2b859848-9551-4119-b580-871d41dc7595 version=5.3.0-0
time="2023-03-30T02:40:48Z" level=debug msg="reconciled instance set" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster instance-set=instance1 name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=2b859848-9551-4119-b580-871d41dc7595 version=5.3.0-0
time="2023-03-30T02:40:48Z" level=debug msg="reconciled cluster" controller=postgrescluster controllerGroup=postgres-operator.crunchydata.com controllerKind=PostgresCluster name=pnst namespace=dev-pnst-test postgresCluster=dev-pnst-test/pnst reconcileID=2b859848-9551-4119-b580-871d41dc7595 version=5.3.0-0

Eric-zch avatar Mar 30 '23 02:03 Eric-zch

Would you please send me the results of the following commands:

kubectl get deployment,pod -A -l postgres-operator.crunchydata.com/control-plane

and

kubectl get endpoints/pnst-ha-config -o yaml -n postgres-operator

dsessler7 avatar Mar 30 '23 23:03 dsessler7

Hi @dsessler7

[zhaoch@x86_64]$kubectl get deployment,pod -A -l postgres-operator.crunchydata.com/control-plane
NAMESPACE                   NAME                          READY   UP-TO-DATE   AVAILABLE   AGE
crunchy-postgres-operator   deployment.apps/pgo           1/1     1            1           197d
crunchy-postgres-operator   deployment.apps/pgo-upgrade   1/1     1            1           197d

NAMESPACE                   NAME                               READY   STATUS    RESTARTS   AGE
crunchy-postgres-operator   pod/pgo-d96694788-zj42s            1/1     Running   0          9d
crunchy-postgres-operator   pod/pgo-upgrade-7768b5c9cf-nhvxl   1/1     Running   0          9d
openshift-operators         pod/pgo-6b94557496-dkzsb           1/1     Running   0          10d
[zhaoch@x86_64]$

[zhaoch@x86_64]$kubectl get endpoints/pnst-ha-config -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    config: '{"loop_wait": 10, "postgresql": {"parameters": {"archive_command": "pgbackrest
      --stanza=db archive-push \"%p\"", "archive_mode": "on", "archive_timeout": 300,
      "autovacuum": "on", "autovacuum_analyze_scale_factor": 0.01, "autovacuum_analyze_threshold":
      50, "autovacuum_max_workers": 5, "autovacuum_vacuum_scale_factor": 0.02, "autovacuum_vacuum_threshold":
      50, "cron.database_name": "PNST", "cron.use_background_workers": "on", "idle_in_transaction_session_timeout":
      600000, "jit": "off", "lc_messages": "en_US.UTF8", "lock_timeout": 600000, "log_autovacuum_min_duration":
      "0", "log_checkpoints": "on", "log_connections": "on", "log_destination": "stderr",
      "log_disconnections": "on", "log_duration": "off", "log_error_verbosity": "verbose",
      "log_line_prefix": "%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h ", "log_lock_waits":
      "on", "log_min_duration_statement": "0", "log_statement": "none", "log_temp_files":
      "0", "maintenance_work_mem": "128MB", "max_connections": 1000, "max_prepared_transactions":
      1000, "max_stack_depth": "2MB", "max_wal_size": "1GB", "min_wal_size": "80MB",
      "password_encryption": "scram-sha-256", "pgnodemx.kdapi_path": "/etc/database-containerinfo",
      "restore_command": "pgbackrest --stanza=db archive-get %f \"%p\"", "shared_buffers":
      "0.25GB", "shared_preload_libraries": "pg_stat_statements,pgnodemx,pgaudit,pg_cron",
      "ssl": "on", "ssl_ca_file": "/pgconf/tls/ca.crt", "ssl_cert_file": "/pgconf/tls/tls.crt",
      "ssl_key_file": "/pgconf/tls/tls.key", "synchronous_commit": "local", "synchronous_standby_names":
      "*", "unix_socket_directories": "/tmp/postgres", "wal_buffers": "16MB", "wal_level":
      "logical", "wal_receiver_status_interval": "1s", "wal_writer_delay": "10ms",
      "work_mem": "32MB"}, "pg_hba": ["local all \"postgres\" peer", "hostssl replication
      \"_crunchyrepl\" all cert", "hostssl \"postgres\" \"_crunchyrepl\" all cert",
      "host all \"_crunchyrepl\" all reject", "host all \"ccp_monitoring\" \"127.0.0.0/8\"
      scram-sha-256", "host all \"ccp_monitoring\" \"::1/128\" scram-sha-256", "host
      all \"ccp_monitoring\" all reject", "hostssl all \"_crunchypgbouncer\" all scram-sha-256",
      "host all \"_crunchypgbouncer\" all reject", "host all all 127.0.0.1/32 trust",
      "host all all localhost trust", "host all all 0.0.0.0/0 md5"], "use_pg_rewind":
      true, "use_slots": false}, "synchronous_mode": false, "synchronous_mode_strict":
      false, "synchronous_node_count": 1, "ttl": 30}'
    history: '[[1,117440824,"reached consistency"],[3,201326848,"reached consistency"],[5,268435768,"reached
      consistency"]]'
    initialize: "7213631986153492562"
  creationTimestamp: "2023-03-23T07:52:59Z"
  labels:
    postgres-operator.crunchydata.com/cluster: pnst
    postgres-operator.crunchydata.com/patroni: pnst-ha
  name: pnst-ha-config
  namespace: dev-pnst-test
  resourceVersion: "1065434434"
  uid: 37356269-a077-4559-91f7-11638ea1c6f5
[zhaoch@x86_64]$

Eric-zch avatar Mar 31 '23 00:03 Eric-zch

So, you are getting duplicate backups because you have two operators running:

crunchy-postgres-operator   pod/pgo-d96694788-zj42s            1/1     Running   0          9d
crunchy-postgres-operator   pod/pgo-upgrade-7768b5c9cf-nhvxl   1/1     Running   0          9d
openshift-operators         pod/pgo-6b94557496-dkzsb           1/1     Running   0          10d

Having multiple operators running can definitely cause weird behavior, so pick one and stick with it.

The other thing I asked for was actually the wrong endpoint, but regardless, you're getting errors from the Kubernetes API saying that the annotations field on a couple of endpoints is too large:

"Failure\",\"message\":\"Endpoints \\\\\"oc-ha-config\\\\\" is invalid: metadata.annotations: Too long: must have at most 262144 bytes\"

"Failure\",\"message\":\"Endpoints \\\\\"smps-ha-config\\\\\" is invalid: metadata.annotations: Too long: must have at most 262144 bytes\"

Could you send the kubectl get endpoints output that I requested before but targeting these endpoints?

dsessler7 avatar Mar 31 '23 17:03 dsessler7
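
As a rough illustration of the duplicate-operator check described above, the following Python sketch scans the pod listing pasted earlier in the thread and reports every namespace that has a running operator pod. The function name `operator_pods_by_namespace` is hypothetical, and the rule that main operator pods are named `pgo-...` while the upgrade controller is named `pgo-upgrade-...` is an assumption drawn only from this listing, not from PGO documentation.

```python
from collections import defaultdict

# Pod listing copied from the thread
# (kubectl get deployment,pod -A -l postgres-operator.crunchydata.com/control-plane).
POD_LISTING = """\
NAMESPACE                   NAME                               READY   STATUS    RESTARTS   AGE
crunchy-postgres-operator   pod/pgo-d96694788-zj42s            1/1     Running   0          9d
crunchy-postgres-operator   pod/pgo-upgrade-7768b5c9cf-nhvxl   1/1     Running   0          9d
openshift-operators         pod/pgo-6b94557496-dkzsb           1/1     Running   0          10d
"""


def operator_pods_by_namespace(listing):
    """Group running operator pods by namespace, skipping pgo-upgrade pods."""
    found = defaultdict(list)
    for line in listing.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        namespace, name, status = fields[0], fields[1], fields[3]
        pod = name.split("/", 1)[-1]  # drop the "pod/" prefix
        # Assumption: pgo-upgrade-* is the upgrade controller, not a second operator.
        is_operator = pod.startswith("pgo-") and not pod.startswith("pgo-upgrade-")
        if is_operator and status == "Running":
            found[namespace].append(pod)
    return dict(found)


operators = operator_pods_by_namespace(POD_LISTING)
for namespace, pods in operators.items():
    print(namespace, pods)
if len(operators) > 1:
    print("warning: operator pods are running in more than one namespace")
```

With the listing from the thread, the sketch reports operator pods in both crunchy-postgres-operator and openshift-operators, which is the situation that produced the duplicate backups.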

Hi @dsessler7, thanks for identifying the root cause of the duplicated database backup. Just as you said, it is related to two operators running; I did not realize that our OpenShift admin had installed a Crunchy Postgres Operator before. After I removed the operator under the openshift-operators namespace, the database backup issue was resolved.

[zhaoch@~]$kubectl get deployment,pod -A -l postgres-operator.crunchydata.com/control-plane
NAMESPACE                   NAME                          READY   UP-TO-DATE   AVAILABLE   AGE
crunchy-postgres-operator   deployment.apps/pgo           1/1     1            1           200d
crunchy-postgres-operator   deployment.apps/pgo-upgrade   1/1     1            1           200d

NAMESPACE                   NAME                               READY   STATUS    RESTARTS   AGE
crunchy-postgres-operator   pod/pgo-d96694788-zj42s            1/1     Running   0          13d
crunchy-postgres-operator   pod/pgo-upgrade-7768b5c9cf-nhvxl   1/1     Running   0          13d
[zhaoch@~]$kubectl get endpoints
NAME             ENDPOINTS                                                            AGE
asb-activemq5    172.17.50.78:61616,172.17.50.78:8080,172.17.50.78:8161 + 2 more...   297d
pnst-ha          172.17.20.207:5432                                                   10d
pnst-ha-config   <none>                                                               10d
pnst-pgbouncer   172.17.26.183:5432                                                   10d
pnst-pods        172.17.15.242,172.17.20.207,172.17.26.158 + 2 more...                10d
pnst-primary     172.21.13.151:5432                                                   10d
pnst-replicas    172.17.26.158:5432                                                   10d

[zhaoch@~]$kubectl get endpoints/smps-ha-config -o yaml
apiVersion: v1
kind: Endpoints
metadata:
  annotations:
    config: '{"loop_wait": 10, "postgresql": {"parameters": {"archive_command": "pgbackrest
      --stanza=db archive-push \"%p\"", "archive_mode": "on", "archive_timeout": 0,
      "autovacuum": "on", "autovacuum_max_workers": 3, "autovacuum_vacuum_cost_limit":
      1000, "autovacuum_vacuum_scale_factor": 0.02, "autovacuum_vacuum_threshold":
      100, "idle_in_transaction_session_timeout": 600000, "jit": "off", "lc_messages":
      "en_US.UTF8", "lock_timeout": 600000, "log_autovacuum_min_duration": "0", "log_checkpoints":
      "on", "log_connections": "on", "log_destination": "stderr", "log_disconnections":
      "on", "log_duration": "off", "log_error_verbosity": "verbose", "log_line_prefix":
      "%t [%p]: [%l-1] user=%u,db=%d,app=%a,client=%h ", "log_lock_waits": "on", "log_min_duration_statement":
      "0", "log_statement": "none", "log_temp_files": "0", "maintenance_work_mem":
      "64MB", "max_connections": 1000, "max_prepared_transactions": 1000, "max_stack_depth":
      "2MB", "max_wal_size": "1GB", "min_wal_size": "80MB", "password_encryption":
      "scram-sha-256", "pgnodemx.kdapi_path": "/etc/database-containerinfo", "restore_command":
      "pgbackrest --stanza=db archive-get %f \"%p\"", "shared_buffers": "0.25GB",
      "shared_preload_libraries": "pg_stat_statements,pgnodemx,pgaudit", "ssl": "on",
      "ssl_ca_file": "/pgconf/tls/ca.crt", "ssl_cert_file": "/pgconf/tls/tls.crt",
      "ssl_key_file": "/pgconf/tls/tls.key", "synchronous_commit": "off", "synchronous_standby_names":
      "*", "unix_socket_directories": "/tmp/postgres", "wal_buffers": "16MB", "wal_level":
      "logical", "wal_receiver_status_interval": "1s", "wal_writer_delay": "10ms",
      "work_mem": "16MB"}, "pg_hba": ["local all \"postgres\" peer", "hostssl replication
      \"_crunchyrepl\" all cert", "hostssl \"postgres\" \"_crunchyrepl\" all cert",
      "host all \"_crunchyrepl\" all reject", "host all \"ccp_monitoring\" \"127.0.0.0/8\"
      md5", "host all \"ccp_monitoring\" \"::1/128\" md5", "hostssl all \"_crunchypgbouncer\"
      all scram-sha-256", "host all \"_crunchypgbouncer\" all reject", "host all all
      127.0.0.1/32 trust", "host all all localhost trust", "host all all 0.0.0.0/0
      md5"], "use_pg_rewind": true, "use_slots": false}, "synchronous_mode": false,
      "synchronous_mode_strict": false, "synchronous_node_count": 1, "ttl": 30}'
    history: '[[1,263268119608,"no recovery target specified","2022-12-20T03:01:33.061389+00:00","smps-instance1-fqlq-0"],[2,840504967328,"no
      recovery target specified","2023-01-14T00:14:52.013951+00:00","smps-instance1-ds4f-0"],[3,840874066080,"no
      recovery target specified","2023-01-14T00:37:02.751580+00:00","smps-instance1-fqlq-0"],[4,893973954720,"no
      recovery target specified","2023-01-16T05:46:54.144728+00:00","smps-instance1-fqlq-0"],[5,910482735264,"no
      recovery target specified","2023-01-16T08:07:27.236165+00:00","smps-instance1-ds4f-0"],[6,910801502368,"no
      recovery target specified","2023-01-16T08:09:27.463841+00:00","smps-instance1-fqlq-0"],[7,938148364448,
     **_<<many more rows like the above omitted>>_**
    initialize: "7174956851541831770"
  creationTimestamp: "2022-12-09T01:37:01Z"
  labels:
    postgres-operator.crunchydata.com/cluster: smps
    postgres-operator.crunchydata.com/patroni: smps-ha
  name: smps-ha-config
  namespace: int-smps
  resourceVersion: "1051720890"
  uid: 4f005814-0470-4bf3-a49c-86a95c237370
[zhaoch@~]$

I tested one-off backup.

oc pgo backup pnst --repoName=repo1 --options="--type=full"
oc pgo backup pnst --repoName=repo1 --options="--type=incr"

There are no such problems now.

Thanks a lot for your help!

Eric-zch avatar Apr 03 '23 04:04 Eric-zch

endpoint_output.txt: full output from kubectl get endpoints/smps-ha-config -o yaml

Eric-zch avatar Apr 03 '23 04:04 Eric-zch

@Eric-zch, glad you got the duplicate backup issue resolved.

As far as the large endpoint annotations go, the massive history annotation is clearly the problem here. This annotation can fill up when there are a lot of failovers/switchovers. You might save a copy of the contents of the history annotation somewhere and then clear it out to get rid of the Kubernetes API error.

dsessler7 avatar Apr 03 '23 18:04 dsessler7
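
To see which annotation is using up the 262144-byte budget quoted in the API error, a small Python sketch like the one below can rank annotations by size. It assumes the limit applies to the combined byte length of all annotation keys and values. The data here is synthetic: the history entries only imitate the format shown in the thread, and the helper name `sizes_by_key` is hypothetical.

```python
import json

# Limit quoted in the Kubernetes API error earlier in the thread.
MAX_ANNOTATION_BYTES = 262144


def sizes_by_key(annotations):
    """Return (key, bytes) pairs, largest first, counting key plus value."""
    sizes = {
        key: len(key.encode("utf-8")) + len(value.encode("utf-8"))
        for key, value in annotations.items()
    }
    return sorted(sizes.items(), key=lambda item: item[1], reverse=True)


# Synthetic stand-in for metadata.annotations of an Endpoints object.
# The entry layout imitates the history rows shown in the thread.
sample_entry = [
    263268119608,
    "no recovery target specified",
    "2022-12-20T03:01:33.061389+00:00",
    "smps-instance1-fqlq-0",
]
history = [[timeline] + sample_entry for timeline in range(1, 3001)]
annotations = {
    "config": '{"loop_wait": 10, "ttl": 30}',
    "history": json.dumps(history),
    "initialize": "7174956851541831770",
}

ranked = sizes_by_key(annotations)
total = sum(size for _, size in ranked)
for key, size in ranked:
    print(f"{key}: {size} bytes")
print(f"total: {total} bytes, limit: {MAX_ANNOTATION_BYTES} bytes")
print("over limit" if total > MAX_ANNOTATION_BYTES else "within limit")
```

To run it against a real object, the `annotations` dictionary could be loaded from the `metadata.annotations` field of `kubectl get endpoints/<name> -o json` output instead of the synthetic data.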

@dsessler7 I have cleared the history annotation. It seems we can use a setting like max_timelines_history: 10 in the postgrescluster YAML file, but after I set it, it did not prune the old history annotation automatically:

  patroni:
    leaderLeaseDurationSeconds: 30
    port: 8008
    syncPeriodSeconds: 10
    dynamicConfiguration:
      synchronous_mode: false
      synchronous_mode_strict: false
      synchronous_node_count: 1
      max_timelines_history: 10

Eric-zch avatar Apr 04 '23 05:04 Eric-zch
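
For a manual cleanup in the meantime, the pruning that a limit of 10 implies can be sketched in Python as follows. This is not Patroni's implementation; the function name `keep_latest` is hypothetical, the history entries are synthetic, and treating a limit of 0 as "keep everything" is an assumption.

```python
import json


def keep_latest(history_json, limit):
    """Return the history annotation value with only the newest `limit` entries."""
    if limit <= 0:
        # Assumption: a limit of 0 means the full history is kept.
        return history_json
    entries = json.loads(history_json)
    # Entries are ordered oldest to newest, so keep the tail of the list.
    return json.dumps(entries[-limit:], separators=(",", ":"))


# Synthetic history with 25 timelines, shaped like the entries in the thread.
history = [[n, 1000 * n, "no recovery target specified"] for n in range(1, 26)]
original = json.dumps(history, separators=(",", ":"))

trimmed = keep_latest(original, 10)
kept = json.loads(trimmed)
print(f"entries before: {len(history)}, after: {len(kept)}")
print(f"bytes before: {len(original)}, after: {len(trimmed)}")
print(f"timelines kept: {kept[0][0]} to {kept[-1][0]}")
```

Saving the original annotation value before writing the trimmed one back keeps the option of restoring it, in line with the advice given earlier in the thread.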

Noted. I've created a story in our backlog.

dsessler7 avatar Apr 04 '23 18:04 dsessler7