WAL segment xxxx was not archived before the 60000ms timeout

Open abelfodil opened this issue 4 months ago • 2 comments

Please ensure you do the following when reporting a bug:

  • [x] Provide a concise description of what the bug is.
  • [x] Provide information about your environment.
  • [x] Provide clear steps to reproduce the bug.
  • [x] Attach applicable logs. Please do not attach screenshots showing logs unless you are unable to copy and paste the log data.
  • [x] Ensure any code / output examples are properly formatted for legibility.

Note that some logs needed to troubleshoot may be found in the /pgdata/<CLUSTERNAME>/pg_log directory on your Postgres instance.

An incomplete bug report can lead to delays in resolving the issue or the closing of a ticket, so please be as detailed as possible.

If you are looking for general support, please view the support page for where you can ask questions.

Thanks for reporting the issue, we're looking forward to helping you!

Overview

On crunchydata/pgo:5.8.3, when backing up the cluster (both locally and on B2), a failure happens:

2025-09-05 12:45:04.969 P00 ERROR: [082]: WAL segment 000000090000406B00000066 was not archived before the 60000ms timeout
HINT: check the archive_command to ensure that all options are correct (especially --stanza).
HINT: check the PostgreSQL server log for errors.
HINT: run the 'start' command if the stanza was previously stopped.

Environment

Please provide the following details:

  • Platform: k3s
  • Platform Version: 1.33.4
  • PGO Image Tag: ubi9-17.5-2520
  • Postgres Version 17.5
  • Storage: Local Path

Steps to Reproduce

REPRO

Provide steps to get to the error condition:

  1. Spin up a cluster using crunchydata/pgo:5.8.2
  2. Upgrade the cluster to crunchydata/pgo:5.8.3 but keep the Postgres image pinned to ubi9-17.5-2520
  3. Try to back up the cluster (incr or full)

EXPECTED

The backup succeeds

ACTUAL

The backup fails

Logs

pgdata/pgbackrest/log/db-archive-get-async.log:

2025-09-08 12:09:16.800 P00   INFO: get 8 WAL file(s) from archive: 00000015000000F100000051...00000015000000F100000058
2025-09-08 12:09:17.094 P00   WARN: repo2: [ProtocolError] expected value '2.54.2' for greeting key 'version' but got '2.56.0'
                                    HINT: is the same version of pgBackRest installed on the local and remote host?
2025-09-08 12:09:20.939 P00   INFO: archive-get:async command end: completed successfully (4140ms)

Additional Information

  1. Issue on pgBackRest's end: https://github.com/pgbackrest/pgbackrest/issues/2680.
  2. Reverting to crunchydata/pgo:5.8.2 fixes the issue.
  3. PGO operator env variable:
     - name: RELATED_IMAGE_PGBACKREST
       value: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi9-2.56.0-2534
    
  4. The actual image for the pgbackrest sidecar container: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi9-2.56.0-2534 (on ALL Postgres instance pods)
  5. The actual image for the pgbackrest repo container: registry.developers.crunchydata.com/crunchydata/crunchy-pgbackrest:ubi9-2.56.0-2534

abelfodil avatar Sep 08 '25 21:09 abelfodil

@abelfodil you will need to update to the latest Postgres 17 image (ubi9-17.6-2534) to fix this error.

pgBackRest requires the same version of pgBackRest on both the Postgres instance and the dedicated pgBackRest repo host. Because pgBackRest was updated to v2.56.0 for the CPK v5.8.3 release (see the Release Notes and the Components & Compatibility page for more info about dependency updates in this release), you will also need to update your Postgres image to align with the dedicated repo host Deployment. The alternative is to use an older version of pgBackRest for your dedicated repo host Deployment, though it is recommended that you instead move to the latest versions of these images so that you have the latest patches, fixes, and updates.
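
The mismatch described above shows up in the pgBackRest log as a ProtocolError warning on the greeting key 'version'. As a minimal sketch (not something PGO or pgBackRest provides), the Python below scans log text for that warning and reports the two versions involved. The function name `find_version_mismatches` and the regular expression are assumptions made for illustration, based only on the log line quoted in this issue; other pgBackRest releases may word the message differently.

```python
import re

# Assumed pattern, derived from the WARN line quoted in this issue.
GREETING_MISMATCH = re.compile(
    r"expected value '(?P<expected>[^']+)' for greeting key 'version' "
    r"but got '(?P<got>[^']+)'"
)


def find_version_mismatches(log_text):
    """Return the unique (expected, got) version pairs found in log_text, sorted."""
    pairs = {
        (match.group("expected"), match.group("got"))
        for match in GREETING_MISMATCH.finditer(log_text)
    }
    return sorted(pairs)


# Log line copied from db-archive-get-async.log in this issue.
sample_log = (
    "2025-09-08 12:09:17.094 P00   WARN: repo2: [ProtocolError] expected value "
    "'2.54.2' for greeting key 'version' but got '2.56.0'\n"
)

for expected, got in find_version_mismatches(sample_log):
    print(f"pgBackRest version mismatch: expected {expected}, got {got}")
```

Run against the contents of the pgBackRest log directory, a non-empty result would suggest that the Postgres image and the repo host image ship different pgBackRest versions.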

andrewlecuyer avatar Sep 08 '25 22:09 andrewlecuyer

Got it, well that fixed my issue! Thanks a lot Andrew!

Is there a way to emit a warning in case versions mismatch?

abelfodil avatar Sep 08 '25 23:09 abelfodil