pg_auto_failover
Primary certificates in the data directory are blown away after failover
If you've put your certificates and keys at the default locations inside the nodes' data directories (and pointed to them with `pg_autoctl enable ssl`), then the primary's cert and key may be replaced with a random secondary's cert and key after failing over, since `pg_rewind` and `pg_basebackup` will both overwrite what's already there. Since the certificates are often matched to the host machines, this is rarely what's wanted IMHO.
This can be easily worked around (just put the cert/key outside of the data directory), but that doesn't help after you've hit it, and the problem can go unnoticed for a while if the secondaries aren't contacted frequently.
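Since the swap can go unnoticed, one way to catch it is to compare the certificate in the data directory against a known-good copy kept elsewhere. The following is only a rough sketch: the function names and the idea of keeping a reference copy outside the data directory are my assumptions, not anything pg_auto_failover provides.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of the file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def cert_was_replaced(live_cert, reference_cert):
    """True when the cert in the data directory differs from the reference copy.

    Both arguments are hypothetical paths: `live_cert` would be the file inside
    the data directory, `reference_cert` a copy stored outside of it.
    """
    return sha256_of(live_cert) != sha256_of(reference_cert)
```

Running something like this on each node after a failover would flag a node that is now serving another host's certificate.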
The `pg_rewind` docs have this to say:
pg_rewind will fail immediately if it finds files it cannot write directly to. This can happen for example when the source and the target server use the same file mapping for read-only SSL keys and certificates. If such files are present on the target server it is recommended to remove them before running pg_rewind. After doing the rewind, some of those files may have been copied from the source, in which case it may be necessary to remove the data copied and restore back the set of links used before the rewind.
Since pg_auto_failover is keeping track of where the certificate and key are stored, it'd be nice if the original copies were moved somewhere safe before the rewind/basebackup, and replaced afterwards. This would also allow those files to be marked read-only without interfering with a rewind.
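To make the suggestion concrete, here is a minimal sketch of the stash-and-restore idea in Python. It is not how pg_auto_failover is implemented (the project is written in C), and `preserve_files` is a hypothetical helper name. It assumes the process can write to the data directory, and it restores file modes but not ownership.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def preserve_files(paths):
    """Move copies of `paths` somewhere safe, then put them back on exit."""
    stash = Path(tempfile.mkdtemp(prefix="ssl-stash-"))
    saved = {}
    try:
        for i, path in enumerate(map(Path, paths)):
            if path.exists():
                backup = stash / f"{i}-{path.name}"
                shutil.copy2(path, backup)  # copy2 keeps mode and timestamps
                saved[path] = backup
                path.unlink()  # remove the original so a read-only file cannot block the rewind
        yield
    finally:
        for path, backup in saved.items():
            path.parent.mkdir(parents=True, exist_ok=True)
            if path.exists():
                path.unlink()  # drop whatever the rewind/basebackup copied in
            shutil.copy2(backup, path)
        shutil.rmtree(stash, ignore_errors=True)


# Hypothetical usage; run_rewind stands in for whatever invokes
# pg_rewind or pg_basebackup, and pgdata for the node's data directory:
#
#     with preserve_files([pgdata / "server.crt", pgdata / "server.key"]):
#         run_rewind()
```

Removing the originals before the rewind follows the recommendation in the `pg_rewind` docs quoted above, which is also what would let the files be marked read-only.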
Repro (tested on 1.4.1 and 1.5.2)
- Create a two-node system.
- Create two cert/key pairs, one for the primary and one for the secondary, and place them in the data directories.
- Point each node to its cert/key pair using `pg_autoctl enable ssl`.
- Fail over to the secondary, in a way that requires the old primary to rewind. (On my machine, a normal switchover is usually orderly enough that `pg_rewind` reports `no rewind required`, and if there's no rewind then you won't hit this bug. I have more luck manually killing the `postgres` process.)
- Bring the failed primary node back up and notice that it now has a copy of the other node's cert/key.
See also #334, which I think we should pay attention to while fixing this one.