postgres_exporter

Exporter container log of replica instance always reports pg_replication_slots pq: recovery is in progress

Open dominik0711 opened this issue 2 years ago • 10 comments

What did you do? I have set up a Crunchy Postgres cluster on my OpenShift cluster with 1 master and 2 replica instances. The exporter container runs as a sidecar container. All replicas log the following error messages:

ts=2023-11-16T13:00:00.362Z caller=namespace.go:236 level=info err="Error running query on database \"localhost:5432\": pg_replication_slots pq: recovery is in progress"
ts=2023-11-16T13:00:00.379Z caller=postgres_exporter.go:731 level=error err="queryNamespaceMappings returned 1 errors"

What did you expect to see? Replication slots on replicas are always inactive and the replicas are always in recovery mode, so I don't expect to see any errors here.

What did you see instead? Under which circumstances?

All replicas report the same messages listed here:

ts=2023-11-16T13:00:00.362Z caller=namespace.go:236 level=info err="Error running query on database \"localhost:5432\": pg_replication_slots pq: recovery is in progress"
ts=2023-11-16T13:00:00.379Z caller=postgres_exporter.go:731 level=error err="queryNamespaceMappings returned 1 errors"

Environment

OpenShift 4.11 on Azure

  • System information:

Linux 4.18.0-372.76.1.el8_6.x86_64 x86_64

  • postgres_exporter version:

postgres_exporter, version 0.10.1 (branch: HEAD, revision: 6cff384d7433bcb1104efe3b496cd27c0658eb09)
  build user: root@eb21848025d7
  build date: 20220114-17:20:30
  go version: go1.17.6
  platform: linux/amd64

  • postgres_exporter flags:
        - name: CONFIG_DIR
          value: /opt/cpm/conf
        - name: POSTGRES_EXPORTER_PORT
          value: '9187'
        - name: PGBACKREST_INFO_THROTTLE_MINUTES
          value: '10'
        - name: PG_STAT_STATEMENTS_LIMIT
          value: '20'
        - name: PG_STAT_STATEMENTS_THROTTLE_MINUTES
          value: '-1'
        - name: EXPORTER_PG_HOST
          value: localhost
        - name: EXPORTER_PG_PORT
          value: '5432'
        - name: EXPORTER_PG_DATABASE
          value: postgres
        - name: EXPORTER_PG_USER
          value: ccp_monitoring
        - name: EXPORTER_PG_PASSWORD
          valueFrom:
            secretKeyRef:
              name: flexis-io-dev-scm-billing-monitoring
              key: password
  • PostgreSQL version:

psql (PostgreSQL) 13.6

  • Logs:
ts=2023-11-16T13:00:00.362Z caller=namespace.go:236 level=info err="Error running query on database \"localhost:5432\": pg_replication_slots pq: recovery is in progress"
ts=2023-11-16T13:00:00.379Z caller=postgres_exporter.go:731 level=error err="queryNamespaceMappings returned 1 errors"

dominik0711 avatar Nov 16 '23 13:11 dominik0711

It seems to me that the pg_current_wal_lsn() function call in queries.go causes this issue.

Other collectors use an idiom like (case pg_is_in_recovery() when 't' then null else pg_current_wal_lsn() end) AS pg_current_wal_lsn, but this query does not. You can't call this function on a PostgreSQL server that is in recovery, which includes a server that both receives and sends replication (the "child" in a parent-child-grandchild replication scenario).

To avoid this error, either this query needs to be fixed, or the --no-collector.replication_slot option might help.

heitatta avatar Jan 05 '24 07:01 heitatta

Same problem with PostgreSQL 14.8 and postgres_exporter 0.15.0, and --no-collector.replication_slot does not fix this.

DJLebedev avatar Apr 02 '24 13:04 DJLebedev

I'm facing the same issue. PostgreSQL: 16.2.0, exporter: postgres-exporter:v0.15.0

postgres-exporter ts=2024-05-14T02:00:02.451Z caller=namespace.go:236 level=info err="Error running query on database \"192.168.0.3:5432\": pg_replication_slots pq: recovery is in progress"
postgres-exporter ts=2024-05-14T02:00:02.451Z caller=postgres_exporter.go:682 level=error err="queryNamespaceMappings returned 1 errors"
postgres-exporter ts=2024-05-14T02:00:05.266Z caller=namespace.go:236 level=info err="Error running query on database \"192.168.0.2:5432\": pg_replication_slots pq: recovery is in progress"
postgres-exporter ts=2024-05-14T02:00:05.348Z caller=postgres_exporter.go:682 level=error err="queryNamespaceMappings returned 1 errors" 
postgres-exporter ts=2024-05-14T02:00:05.956Z caller=namespace.go:236 level=info err="Error running query on database \"192.168.0.2:5432\": pg_replication_slots pq: recovery is in progress"
postgres-exporter ts=2024-05-14T02:00:05.956Z caller=postgres_exporter.go:682 level=error err="queryNamespaceMappings returned 1 errors"
postgres-exporter ts=2024-05-14T02:00:08.350Z caller=namespace.go:236 level=info err="Error running query on database \"192.168.0.2:5432\": pg_replication_slots pq: recovery is in progress"

ihordyrman avatar May 14 '24 10:05 ihordyrman

Same on a Patroni cluster with pglogical replication to another cluster.

dusatvoj avatar Jul 15 '24 22:07 dusatvoj

PostgreSQL 13, exporter 0.15.0. How can this be fixed? @sysadmind / @SuperQ / @Sticksman

tunbb avatar Nov 15 '24 01:11 tunbb

I also encountered this problem, but despite the errors, replication is working correctly. How can I fix it?

agent-atlas avatar Nov 15 '24 20:11 agent-atlas

I'm experiencing similar issues.

cgmEdi avatar Mar 17 '25 12:03 cgmEdi

We have the same issue on a Patroni cluster with postgres 17

La0 avatar May 06 '25 14:05 La0

FWIW, me too. I am involved in an upgrade from PG14/pg-exporter 0.11 and assumed it would take care of this, but it seems not?

Edit: oops, just noticed link above suggests this was fixed in 0.16...

camac2025 avatar May 16 '25 08:05 camac2025

I had the error with versions 0.11.0 and 0.15.0 using --no-collector.replication_slot, but upgrading to version 0.17.0 fixed it.

vortegatorres avatar Jul 21 '25 21:07 vortegatorres