Deadlocks during pg_auto_failover operations
Good afternoon.
Recently, on a test environment, I encountered a failure of pg_auto_failover to perform a switchover due to a deadlock:
postgres@postgres-db05 ~$ pg_autoctl perform switchover --formation vagrant
15:59:10 18413 ERROR Monitor ERROR: deadlock detected
15:59:10 18413 ERROR Monitor DETAIL: Process 18419 waits for ShareLock on transaction 3711; blocked by process 18418.
15:59:10 18413 ERROR Monitor Process 18418 waits for ExclusiveLock on advisory lock [16385,1338977919,0,11]; blocked by process 18419.
15:59:10 18413 ERROR Monitor HINT: See server log for query details.
15:59:10 18413 ERROR Monitor CONTEXT: while updating tuple (0,10) in relation "node"
15:59:10 18413 ERROR Monitor SQL statement "UPDATE pgautofailover.node SET goalstate = $1, statechangetime = now() WHERE nodeid = $2"
15:59:10 18413 ERROR SQL query: SELECT pgautofailover.perform_failover($1, $2)
15:59:10 18413 ERROR SQL params: 'vagrant', '0'
15:59:10 18413 ERROR Failed to perform failover for formation vagrant and group 0
15:59:10 18413 FATAL Failed to perform failover/switchover, see above for details
When executing the command again, immediately after the first one, the deadlock did not hit and the switchover was executed correctly.
In the PostgreSQL logs on the monitor, you can find messages about deadlocks when updating datanode statuses. Once in production, due to the fact that the status of the data node could not be updated on time on the monitor, it caused a failover, which in turn could not be completed, since the old primary built up in the status of demote_timout. A simple restart of the old primary helped us solve this problem. Unfortunately, I have no logs left for that situation. But the fact of the failover itself should not have been in that situation, at the beginning of the failover, the deadlock was to blame, despite the fact that this is a cluster with only two nodes and the monitor only serves it.
It would be desirable and correct for pg_auto_failover in such situations to be able to handle the deadlock event and repeat operations, whether it is updating the status of the datanode on the monitor or performing any operation caused by the administrator or pg_auto_failover himself.
Thank you.
@dimitri I would very much like to hear your comments...