check_postgres
Need standby lag check that doesn't require connection to primary
I'd like to add a new check that measures standby lag using only pg_last_xact_replay_timestamp() on the standby, without having to connect to the primary. Something akin to:
SELECT EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()))::INT;
My reason for this is that our cross-datacenter traffic is blocked except for replication traffic, so our Icinga server in one datacenter can't talk to a database in the other. A check like this would help. For now I'm running this exact query in a custom_query check, but I think more people would find it useful.
I'll try to get a PR up for it later this week.
The problem with that query is that if there is no write activity on the primary, pg_last_xact_replay_timestamp() doesn't advance, so the reported value grows without bound even though the standby is fully caught up.
Would setting the archive_timeout parameter on the primary help to mitigate that concern? If so, I'd suggest adding a note to that effect in the README and/or help output.
I'm not sure, but even if that worked, it would require using archiving, which would be a problematic requirement.
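For what it's worth, one commonly used way around the idle-primary problem, without requiring archiving, is to report zero lag whenever the standby has replayed everything it has received. The query below is only a sketch, not a proposed implementation: it assumes PostgreSQL 10 or later, where the functions are named pg_last_wal_receive_lsn() and pg_last_wal_replay_lsn() (on 9.x they are pg_last_xlog_receive_location() and pg_last_xlog_replay_location()). The "result" column alias is there on the assumption that it will be run through custom_query.

```sql
-- Run on the standby only; no connection to the primary is needed.
SELECT CASE
         -- Everything received has been replayed: treat as no lag,
         -- even if the primary has been idle for a long time.
         WHEN pg_last_wal_receive_lsn() = pg_last_wal_replay_lsn() THEN 0
         -- Otherwise fall back to the age of the last replayed transaction.
         ELSE EXTRACT(EPOCH FROM (now() - pg_last_xact_replay_timestamp()))::INT
       END AS result;
```

One caveat: if the standby loses its connection to the primary, the received and replayed positions will eventually match and this will report 0, so it probably needs to be paired with a separate check that streaming is still active.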