check_postgres icon indicating copy to clipboard operation
check_postgres copied to clipboard

archive_ready does not work on PostgreSQL 10+

Open moench-tegeder opened this issue 6 years ago • 8 comments

check_archive_ready() calls check_wal_files() https://github.com/bucardo/check_postgres/blob/fd9702c9321a7e68965fa07e75d4f38e4de88d9c/check_postgres.pl#L3826 That's fine on PostgreSQL < 10, but on 10+ check_wal_files() calls pg_ls_wal_dir(), which lists only files in the wal dir, but not it's subdirectories. https://github.com/bucardo/check_postgres/blob/fd9702c9321a7e68965fa07e75d4f38e4de88d9c/check_postgres.pl#L8700 Thus the WHERE clause will never be true, and COUNT() always returns 0. I've currently no idea how to fix this in an elegant way: of course one could fall back to the "old" query, but that requires SUPERUSER privileges for pg_ls_dir() (or GRANTs on that, or a wrapper...).

moench-tegeder avatar Sep 19 '18 08:09 moench-tegeder

FWIW, this was fixed in 8e6b3c2514a1fdcca63f0f879e8731efd9cd7bd0 but there has been no release yet.

mbanck avatar Oct 17 '18 09:10 mbanck

Um, no - I rather think it was broken by that commit. With the new version and PostgreSQL10, check_wal_files() ignores the $subdir parameter (where "/archive_status" is passed in) and just uses pg_ls_waldir() - which does not list the contents of the archive_status directory, and the WHERE clause ("WHERE name ~ E'^[0-9A-F]{24}$extrabit$'" whith $extrabit being passed as ".ready") cannot succeed. There are no .ready files in the output of pg_ls_waldir().

moench-tegeder avatar Oct 17 '18 09:10 moench-tegeder

Oh I see - that is indeed the case.

pinging @ChristophBerg as he committed that.

mbanck avatar Oct 17 '18 10:10 mbanck

It seems archive_ready does not work for the current version of postgres (13) and always finds 0 ready files. is there some workaround to get it running?

ChristophKaser avatar Jun 17 '21 08:06 ChristophKaser

Oh I see - that is indeed the case.

pinging @ChristophBerg as he committed that.

I think the person to ping was @df7cb

ChristophKaser avatar Jun 17 '21 08:06 ChristophKaser

Looks like this was fixed as of 1e06b0d8754ec0953d7f4976e3d8dff125718df4

turnstep avatar Apr 04 '23 14:04 turnstep

@turnstep how so, that commit (https://github.com/bucardo/check_postgres/commit/1e06b0d8754ec0953d7f4976e3d8dff125718df4) is from 2015 - maybe it makes it possible to succesfully run check_archive_ready if one sets up an elaborate helper function (I did not check), but fact is that this check currently just reports OK on PG10+ as it does not check in pg_wal/archive_status but in pg_wal directly, please reopen.

postgres@db01:~$ touch /var/lib/postgresql/13/main/pg_wal/archive_status/000000010000000000000001.ready
postgres@db01:~$ touch /var/lib/postgresql/13/main/pg_wal/archive_status/000000010000000000000002.ready
postgres@db01:~$ LANG=C check_postgres -w 1 -c 1 --action archive_ready -H /var/run/postgresql/
POSTGRES_ARCHIVE_READY OK: DB "postgres" (host:/var/run/postgresql/) WAL ".ready" files found: 0 | time=0.02s files=0;1;1 

mbanck avatar Oct 04 '23 15:10 mbanck

With the above PR, I get:

postgres@db01:~$ LANG=C /tmp/check_postgres -w 1 -c 1 --action archive_ready -H /var/run/postgresql/
POSTGRES_ARCHIVE_READY CRITICAL: DB "postgres" (host:/var/run/postgresql/) WAL ".ready" files found: 2 | time=0.02s files=2;1;1 

mbanck avatar Oct 04 '23 15:10 mbanck