Postgres Restore failed
Environment
- Platform: Kubernetes
- Platform Version: 1.23.7
- PGO Image Tag: ubi8-5.7.3
- Postgres Version: 14
- Storage: NFS
Steps to Reproduce
I performed two backups using pgBackRest; the first backup completed automatically after the cluster was created. I then triggered restore operations by modifying the CR file. Output of pgbackrest info:
stanza: db
    status: ok
    cipher: none

    db (current)
        wal archive min/max (14): 000000010000000000000001/00000002000000000000000E

        full backup: 20250310-023907F
            timestamp start/stop: 2025-03-10 02:39:07+00 / 2025-03-10 02:40:15+00
            wal start/stop: 000000010000000000000003 / 000000010000000000000004
            database size: 52.3MB, database backup size: 52.3MB
            repo1: backup set size: 6.7MB, backup size: 6.7MB

        full backup: 20250310-030103F
            timestamp start/stop: 2025-03-10 03:01:03+00 / 2025-03-10 03:01:46+00
            wal start/stop: 000000010000000000000006 / 000000010000000000000007
            database size: 52.4MB, database backup size: 52.4MB
            repo1: backup set size: 6.7MB, backup size: 6.7MB

        incr backup: 20250310-030103F_20250310-030518I
            timestamp start/stop: 2025-03-10 03:05:18+00 / 2025-03-10 03:05:25+00
            wal start/stop: 00000001000000000000000A / 00000001000000000000000B
            database size: 52.4MB, database backup size: 4.0MB
            repo1: backup set size: 6.7MB, backup size: 447.0KB
            backup reference list: 20250310-030103F
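For context, each restore was triggered by adding an in-place restore section to the PostgresCluster CR and updating the postgres-operator.crunchydata.com/pgbackrest-restore annotation. A minimal sketch of that section is shown below; the field values are reconstructed from the restore command visible in the logs further down, not copied verbatim from my manifest.

# in-place PITR restore section of the PostgresCluster CR (sketch)
spec:
  backups:
    pgbackrest:
      restore:
        enabled: true
        repoName: repo1
        options:
        - --type=time
        - --target="2025-03-10 07:04:46"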
The first restore was to the backup set 20250310-030103F_20250310-030518I, which completed normally and the data was intact. I then restored again to the first backup set 20250310-023907F, which also restored normally. However, when I tried to restore to the backup set 20250310-030103F_20250310-030518I once more, an error occurred. Upon checking the logs of the restore pod, I found the following error message.
kg logs pod/v9test-pgbackrest-restore-g6gq4 -f
+ pgbackrest restore --type=time '--target=2025-03-10 07:04:46' --stanza=db --pg1-path=/pgdata/pg14 --repo=1 --delta --target-action=promote --link-map=pg_wal=/pgdata/pg14_wal
2025-03-10 07:29:53.448 GMT [20] LOG: starting PostgreSQL 14.13 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 10.4.0, 64-bit
2025-03-10 07:29:53.463 GMT [20] LOG: listening on IPv6 address "::1", port 5866
2025-03-10 07:29:53.463 GMT [20] LOG: listening on IPv4 address "127.0.0.1", port 5866
2025-03-10 07:29:53.470 GMT [20] LOG: listening on IPv6 address "::1", port 1521
2025-03-10 07:29:53.470 GMT [20] LOG: listening on IPv4 address "127.0.0.1", port 1521
2025-03-10 07:29:53.472 GMT [20] LOG: listening on Unix socket "/tmp/.s.PGSQL.5866"
2025-03-10 07:29:53.485 GMT [20] LOG: listening on Unix socket "/tmp/.s.PGSQL.1521"
2025-03-10 07:29:53.530 GMT [20] LOG: This is an trial version of hgdb-enterprise-9.0.1, valid until 2099年10月25日. After-sales service hotline: 400-708-8006. Online technical support platform:https://support.highgo.com/.
2025-03-10 07:29:53.541 GMT [21] LOG: database system was interrupted; last known up at 2025-03-10 07:03:31 GMT
2025-03-10 07:30:07.674 GMT [21] LOG: restored log file "00000003.history" from archive
2025-03-10 07:30:07.742 GMT [21] LOG: restored log file "00000004.history" from archive
2025-03-10 07:30:07.820 GMT [21] LOG: restored log file "00000005.history" from archive
2025-03-10 07:30:07.893 GMT [21] LOG: starting point-in-time recovery to 2025-03-10 07:04:46+00
2025-03-10 07:30:07.961 GMT [21] LOG: restored log file "00000005.history" from archive
2025-03-10 07:30:08.333 GMT [21] LOG: restored log file "00000004000000000000000D" from archive
2025-03-10 07:30:08.667 GMT [21] LOG: restored log file "00000004000000000000000C" from archive
2025-03-10 07:30:08.709 GMT [21] FATAL: requested timeline 5 is not a child of this server's history
2025-03-10 07:30:08.709 GMT [21] DETAIL: Latest checkpoint is at 0/D000060 on timeline 2, but in the history of the requested timeline, the server forked off from that timeline at 0/F005068.
2025-03-10 07:30:08.712 GMT [20] LOG: startup process (PID 21) exited with exit code 1
2025-03-10 07:30:08.712 GMT [20] LOG: aborting startup due to startup process failure
2025-03-10 07:30:08.716 GMT [20] LOG: database system is shut down
pg_ctl: could not start server
Examine the log output.
2025-03-10 07:30:09.013 GMT [39] LOG: starting PostgreSQL 14.13 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 10.4.0, 64-bit
2025-03-10 07:30:09.014 GMT [39] LOG: listening on IPv6 address "::1", port 5866
2025-03-10 07:30:09.014 GMT [39] LOG: listening on IPv4 address "127.0.0.1", port 5866
2025-03-10 07:30:09.022 GMT [39] LOG: listening on IPv6 address "::1", port 1521
2025-03-10 07:30:09.022 GMT [39] LOG: listening on IPv4 address "127.0.0.1", port 1521
2025-03-10 07:30:09.024 GMT [39] LOG: listening on Unix socket "/tmp/.s.PGSQL.5866"
2025-03-10 07:30:09.032 GMT [39] LOG: listening on Unix socket "/tmp/.s.PGSQL.1521"
2025-03-10 07:30:09.074 GMT [39] LOG: This is an trial version of hgdb-enterprise-9.0.1, valid until 2099年10月25日. After-sales service hotline: 400-708-8006. Online technical support platform:https://support.highgo.com/.
2025-03-10 07:30:09.086 GMT [40] LOG: database system was interrupted; last known up at 2025-03-10 07:03:31 GMT
2025-03-10 07:30:29.056 GMT [40] LOG: restored log file "00000003.history" from archive
2025-03-10 07:30:29.114 GMT [40] LOG: restored log file "00000004.history" from archive
2025-03-10 07:30:29.172 GMT [40] LOG: restored log file "00000005.history" from archive
2025-03-10 07:30:29.257 GMT [40] LOG: starting point-in-time recovery to 2025-03-10 07:04:46+00
2025-03-10 07:30:29.453 GMT [40] LOG: restored log file "00000005.history" from archive
2025-03-10 07:30:29.849 GMT [40] LOG: restored log file "00000004000000000000000D" from archive
2025-03-10 07:30:30.218 GMT [40] LOG: restored log file "00000004000000000000000C" from archive
2025-03-10 07:30:30.255 GMT [40] FATAL: requested timeline 5 is not a child of this server's history
2025-03-10 07:30:30.255 GMT [40] DETAIL: Latest checkpoint is at 0/D000060 on timeline 2, but in the history of the requested timeline, the server forked off from that timeline at 0/F005068.
2025-03-10 07:30:30.258 GMT [39] LOG: startup process (PID 40) exited with exit code 1
2025-03-10 07:30:30.258 GMT [39] LOG: aborting startup due to startup process failure
2025-03-10 07:30:30.262 GMT [39] LOG: database system is shut down
pg_ctl: could not start server
Examine the log output.
I modified the operator's code and added the following lines:
if foundTarget {
    // modification: --target-timeline=current appended alongside --target-action=promote
    opts = append(opts, "--target-action=promote", "--target-timeline=current")
}
With the modified operator, the primary node starts normally, but the secondary node fails with the following error:
2025-03-10 07:42:35,011 WARNING: Postgresql is not running.
2025-03-10 07:42:35,012 INFO: Lock owner: v9test-highgo-vgwk-0; I am v9test-highgo-d4nx-0
2025-03-10 07:42:35,018 INFO: pg_controldata:
pg_control version number: 1300
Catalog version number: 202107181
Database system identifier: 7480068508330737760
Database cluster state: shut down in recovery
pg_control last modified: Mon Mar 10 07:21:41 2025
Latest checkpoint location: 0/D000060
Latest checkpoint's REDO location: 0/C000550
Latest checkpoint's REDO WAL file: 00000004000000000000000C
Latest checkpoint's TimeLineID: 4
Latest checkpoint's PrevTimeLineID: 4
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID: 0:818
Latest checkpoint's NextOID: 40959
Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 0
Latest checkpoint's oldestXID: 784
Latest checkpoint's oldestXID's DB: 1
Latest checkpoint's oldestActiveXID: 818
Latest checkpoint's oldestMultiXid: 1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTsXid: 0
Latest checkpoint's newestCommitTsXid: 0
Time of latest checkpoint: Mon Mar 10 07:20:41 2025
Fake LSN counter for unlogged rels: 0/3E8
Minimum recovery ending location: 0/D000110
Min recovery ending loc's timeline: 4
Backup start location: 0/0
Backup end location: 0/0
End-of-backup record required: no
wal_level setting: logical
wal_log_hints setting: on
max_connections setting: 100
max_worker_processes setting: 8
max_wal_senders setting: 10
max_prepared_xacts setting: 0
max_prepared_foreign_transactions setting: 0
max_locks_per_xact setting: 64
track_commit_timestamp setting: off
Maximum data alignment: 8
Database block size: 8192
Blocks per segment of large relation: 131072
WAL block size: 8192
Bytes per WAL segment: 16777216
Maximum length of identifiers: 64
Maximum columns in an index: 32
Maximum size of a TOAST chunk: 1996
Size of a large-object chunk: 2048
Date/time type storage: 64-bit integers
Float8 argument passing: by value
Data page checksum version: 1
Mock authentication nonce: 0bb02f977b2417c0296e95e2684c288621ab04cd0ed34e291a08ac247de78475
database mode: 1
case conversion mode: 1
2025-03-10 07:42:35,019 INFO: Lock owner: v9test-highgo-vgwk-0; I am v9test-highgo-d4nx-0
2025-03-10 07:42:35,024 INFO: starting as a secondary
2025-03-10 07:42:36,053 INFO: postmaster pid=114
/tmp/postgres:5866 - no response
2025-03-10 07:42:36.174 UTC [114] LOG: pgaudit extension initialized
2025-03-10 07:42:36.364 UTC [114] LOG: starting PostgreSQL 14.13 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 10.4.0, 64-bit
2025-03-10 07:42:36.367 UTC [114] LOG: listening on IPv4 address "0.0.0.0", port 5866
2025-03-10 07:42:36.367 UTC [114] LOG: listening on IPv6 address "::", port 5866
2025-03-10 07:42:36.425 UTC [114] LOG: listening on IPv6 address "::1", port 1521
2025-03-10 07:42:36.425 UTC [114] LOG: listening on IPv4 address "127.0.0.1", port 1521
2025-03-10 07:42:36.445 UTC [114] LOG: listening on Unix socket "/tmp/postgres/.s.PGSQL.5866"
2025-03-10 07:42:36.483 UTC [114] LOG: listening on Unix socket "/tmp/postgres/.s.PGSQL.1521"
2025-03-10 07:42:36.788 UTC [114] LOG: This is an trial version of hgdb-enterprise-9.0.1, valid until 2099年10月25日. After-sales service hotline: 400-708-8006. Online technical support platform:https://support.highgo.com/.
2025-03-10 07:42:36.852 UTC [117] LOG: database system was shut down in recovery at 2025-03-10 07:21:41 UTC
2025-03-10 07:42:37.245 UTC [120] FATAL: the database system is starting up
/tmp/postgres:5866 - rejecting connections
2025-03-10 07:42:37.258 UTC [122] FATAL: the database system is starting up
/tmp/postgres:5866 - rejecting connections
2025-03-10 07:42:37.354 UTC [117] LOG: restored log file "00000005.history" from archive
2025-03-10 07:42:37.652 UTC [117] LOG: restored log file "00000006.history" from archive
2025-03-10 07:42:37.894 UTC [117] LOG: entering standby mode
2025-03-10 07:42:38.053 UTC [117] LOG: restored log file "00000006.history" from archive
2025-03-10 07:42:38.304 UTC [129] FATAL: the database system is starting up
/tmp/postgres:5866 - rejecting connections
2025-03-10 07:42:38.618 UTC [117] LOG: restored log file "00000002000000000000000D" from archive
2025-03-10 07:42:38.852 UTC [117] FATAL: requested timeline 6 does not contain minimum recovery point 0/D000110 on timeline 4
2025-03-10 07:42:38.865 UTC [114] LOG: startup process (PID 117) exited with exit code 1
2025-03-10 07:42:38.865 UTC [114] LOG: aborting startup due to startup process failure
2025-03-10 07:42:38.885 UTC [114] LOG: database system is shut down
/tmp/postgres:5866 - no response
Has anyone encountered this issue before? Is my code modification correct? Please help me with this.
I’m not sure whether this logic lives in PostgreSQL itself or in pgBackRest (the messages look like PostgreSQL's own recovery timeline checks), but both failures in the logs are timeline-consistency errors raised during recovery:
2025-03-10 07:30:30.255 GMT [40] FATAL: requested timeline 5 is not a child of this server's history
2025-03-10 07:42:38.852 UTC [117] FATAL: requested timeline 6 does not contain minimum recovery point 0/D000110 on timeline 4
The first comes from the restore pod, the second from the replica.
Each time a restore is promoted or a failover happens, PostgreSQL switches to a new timeline and archives a new <TLI>.history file recording the parent timeline and the LSN at which the fork occurred. That fork is exactly what the DETAIL line describes:
Latest checkpoint is at 0/D000060 on timeline 2, but in the history of the requested timeline, the server forked off from that timeline at 0/F005068.
In this case, you may need to explicitly specify the correct timeline during recovery (see the sketch below for passing it through the operator), e.g.:
--target-timeline=2
Alternatively, you might consider removing the conflicting 00000006.history and 00000005.history files to prevent timeline mismatch errors during recovery.
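For the first option, note that you shouldn't need to patch the operator at all: entries in spec.backups.pgbackrest.restore.options are passed through to the pgbackrest restore command. Something along these lines (a sketch; the target time and timeline number are assumptions to adapt to your case) should have the same effect as the code change above:

spec:
  backups:
    pgbackrest:
      restore:
        enabled: true
        repoName: repo1
        options:
        - --type=time
        - --target="2025-03-10 07:04:46"   # assumed: same PITR target as before
        - --target-timeline=2              # or --target-timeline=current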