repmgr
repmgr switchover fails with
Setup:
`
 ID | Name | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string
----+------+---------+-----------+----------+----------+----------+----------+------------------------------------------------------
 1  | nvme | primary | * running |          | default  | 100      | 5        | host=x.1 user=repmgr dbname=repmgr connect_timeout=2
 2  | hdd  | standby |   running | 1        | default  | 100      | 6        | host=x.2 user=repmgr dbname=repmgr connect_timeout=2
`
Steps to reproduce:
- Turn off Node 1 (nvme).
- Promote Node 2 (hdd). On Node 2, run:
  `repmgr primary unregister --node-id 1`
- Wipe the PostgreSQL folder on Node 1 (this will happen due to the hardware setup).
- Clone from Node 2 to Node 1. On Node 1, run:
  `repmgr -h x.2 -U repmgr -d repmgr --fast-checkpoint standby clone`
- Switch the primary from Node 2 to Node 1. On Node 1, run:
  `repmgr standby switchover`
The last command fails with a timeout while "waiting for received WAL to flush to disk".
Running the switchover command with `--always-promote` does not help, and it leaves the cluster in a broken state with both Node 1 and Node 2 set as primary.
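To spot the two-primary state quickly, a small helper along these lines could be used. This is only a minimal sketch and not part of repmgr: `count_primaries` is a hypothetical name, and it assumes the pipe-separated `repmgr cluster show` layout shown in the setup above, with the role in the third column. In a broken cluster repmgr may flag the mismatch in the Status column instead, so a count of 1 here does not prove the cluster is healthy.

```shell
#!/bin/sh
# Hypothetical helper (not a repmgr command): reads `repmgr cluster show`
# output on stdin and prints how many rows have "primary" in the Role column.
# Assumes the pipe-separated layout with Role as the third field.
count_primaries() {
    awk -F'|' '
        {
            role = $3
            gsub(/^[ \t]+|[ \t]+$/, "", role)   # trim column padding
            if (role == "primary") n++          # header/separator rows never match
        }
        END { print n + 0 }                     # print 0 when nothing matched
    '
}

# Intended use on a live cluster (not executed here):
#   repmgr cluster show | count_primaries
```

Any result greater than 1 would correspond to the state described above, where both nodes are listed as primary.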
Any idea what causes the error, or how a switchover can be achieved in this situation?