postgres_exporter icon indicating copy to clipboard operation
postgres_exporter copied to clipboard

pg_stat_activity parsing errors for example queries.yaml

Open stephan2012 opened this issue 4 years ago • 3 comments

Running the postgres_exporter with the current queries.yaml from this repo against a Postgres 11 raises parsing errors:

time="2021-01-28T17:10:31Z" level=info msg="Established new database connection to \"127.0.0.1:5432\"." source="postgres_exporter.go:898"
time="2021-01-28T17:10:32Z" level=info msg="Established new database connection to \"127.0.0.1:5432\"." source="postgres_exporter.go:898"
time="2021-01-28T17:10:34Z" level=info msg="Established new database connection to \"127.0.0.1:5432\"." source="postgres_exporter.go:898"
time="2021-01-28T17:10:37Z" level=error msg="Error opening connection to database (postgresql://postgres:[email protected]:5432?sslmode=require): dial tcp 127.0.0.1:5432: connect: connection refused" source="postgres_exporter.go:1494"
time="2021-01-28T17:10:37Z" level=info msg="Starting Server: :9187" source="postgres_exporter.go:1699"
time="2021-01-28T17:10:53Z" level=info msg="Established new database connection to \"127.0.0.1:5432\"." source="postgres_exporter.go:898"
time="2021-01-28T17:10:53Z" level=info msg="Semantic Version Changed on \"127.0.0.1:5432\": 0.0.0 -> 11.10.0" source="postgres_exporter.go:1425"
time="2021-01-28T17:10:53Z" level=info msg="Could not parse []byte: strconv.ParseFloat: parsing \"{0,0,1,1,1,1,1,1,1}\": invalid syntax" source="postgres_exporter.go:752"
time="2021-01-28T17:10:53Z" level=info msg="Unparseable column type - discarding:  pg_stat_activity process_idle_seconds_bucket <nil>\n" source="postgres_exporter.go:1381"
time="2021-01-28T17:11:23Z" level=info msg="Could not parse []byte: strconv.ParseFloat: parsing \"{0,0,1,1,1,1,1,1,1}\": invalid syntax" source="postgres_exporter.go:752"
time="2021-01-28T17:11:23Z" level=info msg="Unparseable column type - discarding:  pg_stat_activity process_idle_seconds_bucket <nil>\n" source="postgres_exporter.go:1381"

This is the result of running the SQL statement:

$ psql
psql (13.1 (Ubuntu 13.1-1.pgdg18.04+1), server 11.10 (Ubuntu 11.10-1.pgdg18.04+1))
Type "help" for help.

postgres=# WITH
      metrics AS (
        SELECT
          application_name,
          SUM(EXTRACT(EPOCH FROM (CURRENT_TIMESTAMP - state_change))::bigint)::float AS process_idle_seconds_sum,
          COUNT(*) AS process_idle_seconds_count
        FROM pg_stat_activity
        WHERE state = 'idle'
        GROUP BY application_name
      ),
      buckets AS (
        SELECT
          application_name,
          le,
          SUM(
            CASE WHEN EXTRACT(EPOCH FROM (CURRENT_TIMESTAMP - state_change)) <= le
              THEN 1
              ELSE 0
            END
          )::bigint AS bucket
        FROM
          pg_stat_activity,
          UNNEST(ARRAY[1, 2, 5, 15, 30, 60, 90, 120, 300]) AS le
        GROUP BY application_name, le
        ORDER BY application_name, le
      )
    SELECT
      application_name,
      process_idle_seconds_sum,
      process_idle_seconds_count,
      ARRAY_AGG(le) AS process_idle_seconds,
      ARRAY_AGG(bucket) AS process_idle_seconds_bucket
    FROM metrics JOIN buckets USING (application_name)
    GROUP BY 1, 2, 3
;
 application_name  | process_idle_seconds_sum | process_idle_seconds_count |    process_idle_seconds     | process_idle_seconds_bucket 
-------------------+--------------------------+----------------------------+-----------------------------+-----------------------------
                   |                       17 |                          1 | {1,2,5,15,30,60,90,120,300} | {0,0,0,0,1,1,1,1,1}
 Patroni           |                        1 |                          1 | {1,2,5,15,30,60,90,120,300} | {0,1,1,1,1,1,1,1,1}
 pg_cron scheduler |                     4583 |                          1 | {1,2,5,15,30,60,90,120,300} | {0,0,0,0,0,0,0,0,0}

stephan2012 avatar Jan 28 '21 18:01 stephan2012

Hello,

I got the same error, after some investigation this is mostly due to the fact that you are using the latest v0.8.0 release that was cut in late 2019 while the commit that allows to define HISTOGRAM metrics was added in late 2020. I cut a "fake" 0.9.0 release in my fork if you want to try, it solved the problem on my side.

StephenSorriaux avatar Feb 08 '21 20:02 StephenSorriaux

@StephenSorriaux No, I am using the latest available Docker image and added queries.yaml to the container image.

Meanwhile I am wondering why the metric replay_lag from pg_stat_replication is discarded. Is it simply because nobody has implemented handling durations or just not suitable for technical reasons?

stephan2012 avatar Feb 09 '21 10:02 stephan2012

@StephenSorriaux No, I am using the latest available Docker image and added queries.yaml to the container image.

Meanwhile I am wondering why the metric replay_lag from pg_stat_replication is discarded. Is it simply because nobody has implemented handling durations or just not suitable for technical reasons?

yes, i wonder the same, why so many fields are set to DISCARD for the pg_stat_replication table. Thanks for any info on this.

freddielunchbird avatar Aug 02 '22 09:08 freddielunchbird