Failed node never marked inactive in Repmgr schema
I am trying to create a monitoring solution for Repmgr-backed PostgreSQL clusters. Since I couldn't find an existing solution, I am exposing Prometheus-compatible metrics using a generic SQL exporter.
One of those metrics exposes whether a node is active.
However, even if I forcefully kill one of the three nodes (the cluster consists of a primary, a replica, and a witness), it is never marked as inactive in the repmgr.nodes table in the primary's database.
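For reference, the metric is roughly equivalent to the sketch below. The query reads the active column of repmgr.nodes; the metric name repmgr_node_active and the render_metrics helper are placeholders made up for illustration and are not part of repmgr or of the exporter. The sample rows mirror the situation described above, where the killed node still shows as active.

```python
# Query the exporter runs against the primary's repmgr database.
NODE_STATUS_QUERY = (
    "SELECT node_id, node_name, type, active "
    "FROM repmgr.nodes ORDER BY node_id"
)


def render_metrics(rows):
    """Turn (node_id, node_name, type, active) rows into Prometheus text lines.

    Hypothetical helper: the metric name is an illustrative choice.
    """
    lines = [
        "# HELP repmgr_node_active Whether repmgr.nodes marks the node as active.",
        "# TYPE repmgr_node_active gauge",
    ]
    for node_id, node_name, node_type, active in rows:
        labels = f'node_id="{node_id}",node_name="{node_name}",type="{node_type}"'
        lines.append(f"repmgr_node_active{{{labels}}} {1 if active else 0}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Sample rows: node 1 has been killed but is still recorded as active.
    sample = [
        (1, "repmgr-test-1", "standby", True),
        (2, "repmgr-test-2", "primary", True),
        (3, "repmgr-test-3", "witness", True),
    ]
    print(render_metrics(sample), end="")
```

Because the metric is derived purely from the stored record, it can only be as current as the active column itself.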
The repmgr log does mention the node disconnecting:
[NOTICE] standby node "repmgr-test-1" (ID: 1) has disconnected
Running either repmgr cluster show or repmgr service status also reports it as unreachable:
 ID | Name          | Role    | Status        | Upstream        | Location | Priority | Timeline | Connection string
----+---------------+---------+---------------+-----------------+----------+----------+----------+------------------------------------------------------------------------------------------
 1  | repmgr-test-1 | standby | ? unreachable | ? repmgr-test-2 | default  | 100      |          | host=repmgr-test-1 port=5432 user=repmgr dbname=repmgr connect_timeout=2
 2  | repmgr-test-2 | primary | * running     |                 | default  | 100      | 4        | host=repmgr-test-2 port=5432 user=repmgr dbname=repmgr connect_timeout=2
 3  | repmgr-test-3 | witness | * running     | repmgr-test-2   | default  | 0        | n/a      | host=repmgr-test-3 port=5432 user=repmgr dbname=repmgr connect_timeout=2
WARNING: following issues were detected
- unable to connect to node "repmgr-test-1" (ID: 1)
- node "repmgr-test-1" (ID: 1) is registered as an active standby but is unreachable
However, since these commands take about 3 seconds to report this state, they must be performing some sort of live check against each node rather than relying only on the records in the repmgr database.
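To illustrate the difference, a monitoring script could consume that live view instead of the table. The following is only a sketch, assuming that repmgr cluster show --csv prints one line per node in the form node ID, availability (0 = available, -1 = unavailable), recovery state, which is my reading of the repmgr documentation. The parse_cluster_show_csv helper and the sample text are hypothetical and were not captured from my cluster.

```python
import csv
import io


def parse_cluster_show_csv(text):
    """Map node ID -> True if the CSV output reports the node as available.

    Assumes each line is: node ID, availability, recovery state.
    """
    status = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        node_id, availability = int(row[0]), int(row[1])
        status[node_id] = availability == 0  # 0 = available, -1 = unavailable
    return status


if __name__ == "__main__":
    # Hypothetical output for a cluster in which node 1 is down.
    sample_output = "1,-1,-1\n2,0,0\n3,0,1\n"
    for node_id, available in parse_cluster_show_csv(sample_output).items():
        print(node_id, "available" if available else "unavailable")
```

This would have to run on a host where the repmgr binary and its configuration file are available, which is exactly what I was hoping to avoid by using a generic SQL exporter.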
Is it intended behavior that repmgr.nodes is not updated when a node fails? If so, is there a reliable way to find out whether a given node is healthy?
My setup is not actually ingesting or replicating any data; I only mean to test the functionality and reporting of the cluster state.
OS: Debian 12 Bookworm
Repmgr version: 5.5.0
PostgreSQL version: 17.5