pg_auto_failover
Set the pgautofailover_monitor user's password to NULL.
We don't actually need to have a password for this user, given that the monitor doesn't authenticate when doing health checks. Not having a password to manage also makes it easier to be compliant with the password storage policies of our users (e.g. md5 vs scram-sha-256).
Fixes #763.
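For reference, removing the password amounts to something like the following SQL on the node (a minimal sketch; the exact statement pg_autoctl issues may differ):

```sql
-- Remove the stored password for the monitor's health-check role.
ALTER ROLE pgautofailover_monitor PASSWORD NULL;
```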
I don't think this change is good in its current form. If the password is set to NULL and pg_hba.conf is set to only allow scram-sha-256 (or md5), then the postgres logs on node1 are flooded with the following FATAL errors (a new log entry every second):
See also https://github.com/citusdata/pg_auto_failover/pull/672 which has similar errors.
I just came across this because I was having a similar issue. In our setup script we configure a random password for the pgautofailover_monitor user; in 1.4.2 we were not having issues, but now we are seeing our logs full of entries like:
2021-10-11 21:53:35.527 UTC [120300] postgres [unknown] pgautofailover_monitor 10.128.0.3 6164b25f.1d5ec FATAL: password authentication failed for user "pgautofailover_monitor"
2021-10-11 21:53:35.527 UTC [120300] postgres [unknown] pgautofailover_monitor 10.128.0.3 6164b25f.1d5ec DETAIL: Connection matched pg_hba.conf line 105: "hostssl all pgautofailover_monitor 10.0.0.0/8 scram-sha-256"
2021-10-11 21:53:35.529 UTC [120301] postgres [unknown] pgautofailover_monitor 10.128.0.3 6164b25f.1d5ed FATAL: no pg_hba.conf entry for host "10.128.0.3", user "pgautofailover_monitor", database "postgres", SSL off
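In case it helps anyone debugging this, one way to check whether the pgautofailover_monitor role on a data node still has a password stored (assuming superuser access) is a query along these lines:

```sql
-- rolpassword is NULL when no password is stored for the role.
SELECT rolname, rolpassword IS NULL AS password_is_null
FROM pg_authid
WHERE rolname = 'pgautofailover_monitor';
```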
After finding this and setting the password to the hardcoded value of pgautofailover_monitor, that error stopped and I saw this in the events:
2021-10-11 21:53:57.570284+00 | 0/7 | secondary | secondary | Node node 7 "node_7" (citus-andres-dev-coord-b-34m8.c.acme-qa01.internal:5432) is marked as healthy by the monitor
2021-10-11 21:54:23.882783+00 | 0/6 | primary | primary | Node node 6 "node_6" (citus-andres-dev-coord-a-z1x0.c.acme-qa01.internal:5432) is marked as healthy by the monitor
I also saw the nodes as read-write or read-only.
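For anyone else hitting this, the workaround described above boils down to something like the following on each data node (a hedged sketch; the password value is the hardcoded one mentioned earlier in this thread):

```sql
-- Workaround from this thread: set the password to the hardcoded value noted above.
ALTER ROLE pgautofailover_monitor PASSWORD 'pgautofailover_monitor';
```

After that, the monitor should mark the nodes as healthy again, as shown in the events above.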
It would be great if this were better documented somewhere in the docs.
Thanks!