check_postgres
check_postgres copied to clipboard
archive_ready does not work on PostgreSQL 10+
check_archive_ready() calls check_wal_files() https://github.com/bucardo/check_postgres/blob/fd9702c9321a7e68965fa07e75d4f38e4de88d9c/check_postgres.pl#L3826 That's fine on PostgreSQL < 10, but on 10+ check_wal_files() calls pg_ls_wal_dir(), which lists only files in the wal dir, but not it's subdirectories. https://github.com/bucardo/check_postgres/blob/fd9702c9321a7e68965fa07e75d4f38e4de88d9c/check_postgres.pl#L8700 Thus the WHERE clause will never be true, and COUNT() always returns 0. I've currently no idea how to fix this in an elegant way: of course one could fall back to the "old" query, but that requires SUPERUSER privileges for pg_ls_dir() (or GRANTs on that, or a wrapper...).
FWIW, this was fixed in 8e6b3c2514a1fdcca63f0f879e8731efd9cd7bd0 but there has been no release yet.
Um, no - I rather think it was broken by that commit. With the new version and PostgreSQL10, check_wal_files() ignores the $subdir parameter (where "/archive_status" is passed in) and just uses pg_ls_waldir() - which does not list the contents of the archive_status directory, and the WHERE clause ("WHERE name ~ E'^[0-9A-F]{24}$extrabit$'" whith $extrabit being passed as ".ready") cannot succeed. There are no .ready files in the output of pg_ls_waldir().
Oh I see - that is indeed the case.
pinging @ChristophBerg as he committed that.
It seems archive_ready does not work for the current version of postgres (13) and always finds 0 ready files. is there some workaround to get it running?
Oh I see - that is indeed the case.
pinging @ChristophBerg as he committed that.
I think the person to ping was @df7cb
Looks like this was fixed as of 1e06b0d8754ec0953d7f4976e3d8dff125718df4
@turnstep how so, that commit (https://github.com/bucardo/check_postgres/commit/1e06b0d8754ec0953d7f4976e3d8dff125718df4) is from 2015 - maybe it makes it possible to succesfully run check_archive_ready
if one sets up an elaborate helper function (I did not check), but fact is that this check currently just reports OK
on PG10+ as it does not check in pg_wal/archive_status
but in pg_wal
directly, please reopen.
postgres@db01:~$ touch /var/lib/postgresql/13/main/pg_wal/archive_status/000000010000000000000001.ready
postgres@db01:~$ touch /var/lib/postgresql/13/main/pg_wal/archive_status/000000010000000000000002.ready
postgres@db01:~$ LANG=C check_postgres -w 1 -c 1 --action archive_ready -H /var/run/postgresql/
POSTGRES_ARCHIVE_READY OK: DB "postgres" (host:/var/run/postgresql/) WAL ".ready" files found: 0 | time=0.02s files=0;1;1
With the above PR, I get:
postgres@db01:~$ LANG=C /tmp/check_postgres -w 1 -c 1 --action archive_ready -H /var/run/postgresql/
POSTGRES_ARCHIVE_READY CRITICAL: DB "postgres" (host:/var/run/postgresql/) WAL ".ready" files found: 2 | time=0.02s files=2;1;1