Postgres autodiscovery settings do not exclude databases for query samples
We are using AlloyDB, which has a database like the auto-excluded cloudsqladmin, but called alloydbadmin (there's also an alloydbmetadata DB).
We have this config:
collect_bloat_metrics: true
database_autodiscovery:
  enabled: true
  exclude:
    - alloydbadmin
    - alloydbmetadata
  include:
    - REDACTED # But they are literal DB names, no regex metacharacters
dbm: true
empty_default_hostname: true
host: REDACTED
max_relations: 1000
password: REDACTED
port: 5432
query_samples:
  explain_parameterized_queries: true
relations:
  - relation_regex: .*
    schemas:
      - public
reported_hostname: REDACTED
ssl: require
tags:
  - REDACTED
username: datadog
(I don't think I should need the exclude at all, but I added it in an attempt to debug this.)
When we run checks, the datadog user attempts to connect to the alloydbadmin and alloydbmetadata DBs and fails, as it doesn't have access. Are we trying to exclude this in the wrong way?
Logs look like this:
2024-10-07 13:24:14 UTC | CORE | DEBUG | (pkg/collector/python/datadog_agent.go:133 in LogMessage) | postgres:672cac915837c236 | (statement_samples.py:540) | using cached explain_setup_state for DB 'alloydbadmin': DBExplainError.failed_function
2024-10-07 13:46:10 UTC | CORE | WARN | (pkg/collector/python/datadog_agent.go:129 in LogMessage) | postgres:da54d250a2fa72fe | (postgres.py:982) | Unable to collect execution plans in dbname=alloydbadmin. Check that the function datadog.explain_statement exists in the database. See https://docs.datadoghq.com/database_monitoring/setup_postgres/troubleshooting#undefined-explain-function for more details: connection to server at "REDACTED", port 5432 failed: FATAL: pg_hba.conf rejects connection for host "REDACTED", user "datadog", database "alloydbadmin", SSL encryption
I see there are also exclusions for schema collection (https://github.com/DataDog/integrations-core/pull/18145), but nothing for statement collection.
It might be nice to have a single config setting that means the agent will never attempt to connect to, or report anything about, a particular DB, rather than having to configure this in several places.
Support confirmed that we also need to set ignore_databases: https://github.com/DataDog/integrations-core/blob/master/postgres/datadog_checks/postgres/data/conf.yaml.example#L74-L84
I do think this is confusing, especially the difference in terminology (exclude vs ignore), but I appreciate that will be hard to change in a non-breaking way.
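For anyone hitting the same problem, here is a minimal sketch of the workaround support described, combining autodiscovery with ignore_databases. It assumes ignore_databases sits at the instance level next to database_autodiscovery, as in the linked conf.yaml.example. Whether an explicit list replaces the built-in defaults (such as cloudsqladmin) is not confirmed in this thread, so check the example file before relying on it.

```yaml
# Sketch only: instance-level fragment of the Postgres check config.
# Other keys (host, port, username, dbm, ...) are unchanged from the config above.
database_autodiscovery:
  enabled: true
  include:
    - REDACTED
# ignore_databases is the option support pointed to for statement samples.
# Assumption: an explicit list may override the defaults, so re-add any
# default entries you still need after checking conf.yaml.example.
ignore_databases:
  - alloydbadmin
  - alloydbmetadata
```

With this in place, the exclude entries under database_autodiscovery should not be needed for these two databases, though keeping them is harmless.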
@smcgivern Similar to cloudsqladmin, we are looking into adding alloydbadmin and alloydbmetadata to the default exclusion list.
This should land in Agent 7.61! I'll close the issue because of the imminent fix.