integrations-core
Datadog agent crashes when checking postgres database that does not exist
When setting up a postgres check for a database that does not yet exist, the agent crashes repeatedly:
agent 2023-04-14 14:24:50 UTC | CORE | WARN | (pkg/collector/python/datadog_agent.go:132 in LogMessage) | postgres:6d5d070a9ae3d220 | (statement_samples.py:501) | cannot collect execution plans due to invalid schema in dbname=db: InvalidSchemaName('schema "datadog" does not exist\nLINE 1: SELECT datadog.explain_statement($stmt$SELECT * FROM pg_stat...\n ^\n')
agent 2023-04-14 14:24:51 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:130 in LogMessage) | postgres:b5594c11eb3e0395 | (tracking.py:84) | operation _run_explain error
agent Traceback (most recent call last):
agent File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/tracking.py", line 71, in wrapper
agent result = function(self, *args, **kwargs)
agent File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/postgres/statement_samples.py", line 553, in _run_explain
agent cursor.execute(
agent psycopg2.errors.InvalidSchemaName: schema "datadog" does not exist
agent LINE 1: SELECT datadog.explain_statement($stmt$SELECT * FROM pg_stat...
agent ^
agent 2023-04-14 14:24:51 UTC | CORE | WARN | (pkg/collector/python/datadog_agent.go:132 in LogMessage) | postgres:b5594c11eb3e0395 | (statement_samples.py:501) | cannot collect execution plans due to invalid schema in dbname=db: InvalidSchemaName('schema "datadog" does not exist\nLINE 1: SELECT datadog.explain_statement($stmt$SELECT * FROM pg_stat...\n ^\n')
trace-agent 2023-04-14 14:24:52 UTC | TRACE | INFO | (run.go:253 in Info) | No data received
agent 2023-04-14 14:24:54 UTC | CORE | ERROR | (pkg/collector/python/datadog_agent.go:130 in LogMessage) | postgres:ea5ed6df20a07826 | (postgres.py:654) | Unable to collect postgres metrics.
agent Traceback (most recent call last):
agent File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/postgres/postgres.py", line 635, in check
agent self._connect()
agent File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/postgres/postgres.py", line 464, in _connect
agent self.db = self._new_connection(self._config.dbname)
agent File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/postgres/postgres.py", line 448, in _new_connection
agent conn = psycopg2.connect(**args)
agent File "/opt/datadog-agent/embedded/lib/python3.8/site-packages/psycopg2/__init__.py", line 127, in connect
agent conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
agent psycopg2.OperationalError: FATAL: database "<db_name>" does not exist
agent 2023-04-14 14:24:54 UTC | CORE | ERROR | (pkg/collector/worker/check_logger.go:69 in Error) | check:postgres | Error running check: [{"message": "FATAL: database \"db_name\" does not exist\n", "traceback": "Traceback (most recent call last):\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py\", line 1122, in run\n self.check(instance)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/postgres/postgres.py\", line 667, in check\n raise e\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/postgres/postgres.py\", line 635, in check\n self._connect()\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/postgres/postgres.py\", line 464, in _connect\n self.db = self._new_connection(self._config.dbname)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/postgres/postgres.py\", line 448, in _new_connection\n conn = psycopg2.connect(**args)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/psycopg2/__init__.py\", line 127, in connect\n conn = _connect(dsn, connection_factory=connection_factory, **kwasync)\npsycopg2.OperationalError: FATAL: database \"<db_name>" does not exist\n\n"}]
We are in the process of deploying a new service, and the infrastructure to support it was stood up before the service itself was deployed. As a result, the agent crashes repeatedly.
- EKS kubernetes 1.24
- helm chart 5.4.2
clusterAgent:
  confd:
    postgres.yaml:
      rds_host_address: <some address>
      rds_host_port: 5432
      rds_datadog_username: username
      rds_datadog_password: password
      db_names:
        - <non-existent-db-name>
Steps to reproduce the issue:
- Deploy the postgres integration pointing to a database that does not exist
- Agent crashes
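As a possible stopgap until the check handles this case, the configured names could be compared against what the server actually has before the integration is enabled. The sketch below is only an illustration under assumptions: `missing_databases` and `LIST_DATABASES_SQL` are hypothetical names, not part of the agent or the postgres check, and the list of existing databases is assumed to come from running the query against `pg_database` yourself.

```python
# Hypothetical pre-flight helper; not part of datadog_checks.
# pg_database is a standard PostgreSQL catalog, so an operator could run this
# query (for example with psql) to obtain the list of existing databases.
LIST_DATABASES_SQL = "SELECT datname FROM pg_database WHERE NOT datistemplate"


def missing_databases(configured, existing):
    """Return the configured database names that are absent from the server.

    `configured` is the list given under db_names in the integration config;
    `existing` is the list of datname values returned by LIST_DATABASES_SQL.
    The result preserves the order of `configured`.
    """
    existing_set = set(existing)
    return [name for name in configured if name not in existing_set]


if __name__ == "__main__":
    # Example values are made up for illustration.
    absent = missing_databases(["orders", "billing"], ["postgres", "orders"])
    if absent:
        print("Not enabling postgres check yet; missing databases:", absent)
```

A deployment script could skip or delay the postgres instance for any name this returns, rather than letting the check fail on every run.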
We are seeing the same issue
I have a somewhat similar issue: the schema "datadog" is reported as not existing. I don't have the "database does not exist" problem, though.
2023-05-22T15:42:01.570742+00:00 app[worker_default.1]: Traceback (most recent call last):
2023-05-22T15:42:01.570744+00:00 app[worker_default.1]: File "/app/.apt/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/utils/tracking.py", line 71, in wrapper
2023-05-22T15:42:01.570745+00:00 app[worker_default.1]: result = function(self, *args, **kwargs)
2023-05-22T15:42:01.570749+00:00 app[worker_default.1]: File "/app/.apt/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/postgres/statement_samples.py", line 566, in _run_explain
2023-05-22T15:42:01.570750+00:00 app[worker_default.1]: cursor.execute(
2023-05-22T15:42:01.570751+00:00 app[worker_default.1]: psycopg2.errors.InvalidSchemaName: schema "datadog" does not exist
2023-05-22T15:42:01.570751+00:00 app[worker_default.1]: LINE 1: SELECT datadog.explain_statement($stmt$SELECT * FROM pg_stat...
2023-05-22T15:42:01.570751+00:00 app[worker_default.1]: ^
2023-05-22T15:42:01.571244+00:00 app[worker_default.1]: 2023-05-22 15:42:01 UTC | CORE | WARN | (pkg/collector/python/datadog_agent.go:132 in LogMessage) | postgres:bfcac488e9665929 | (statement_samples.py:514) | cannot collect execution plans due to invalid schema in dbname=tsadmin: InvalidSchemaName('schema "datadog" does not exist\nLINE 1: SELECT datadog.explain_statement($stmt$SELECT * FROM pg_stat...\n ^\n')
Still, running this psql command works and gives a result:
psql -h <HOST> -p <PORT> -U datadog <DB_NAME> -A \
-c "select datadog.explain_statement('SELECT * FROM pg_stat_activity');"
In my case, I'm running a Python app on Heroku with the buildpack and enabled the postgres integration, that's it basically.
Is there any solution, or workaround?
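One thing that may be worth checking: schemas in PostgreSQL are per-database, and the warning names the database where the lookup failed (`dbname=tsadmin` in the log above, `dbname=db` in the original report). A psql session against a different database can therefore succeed while the agent's explain call fails. The sketch below pulls those database names out of agent log lines so they can be compared with the database used in the manual test. It is a minimal illustration: `databases_missing_schema` is a hypothetical helper, and the pattern is based only on the warning text quoted in this thread, so it may not match other agent versions.

```python
import re

# Pattern derived from the warning lines quoted in this thread
# (statement_samples.py); other agent versions may word it differently.
INVALID_SCHEMA_RE = re.compile(
    r"cannot collect execution plans due to invalid schema in dbname=(?P<dbname>\S+?):"
)


def databases_missing_schema(log_lines):
    """Return the sorted, de-duplicated dbname values named in the warnings."""
    found = set()
    for line in log_lines:
        match = INVALID_SCHEMA_RE.search(line)
        if match:
            found.add(match.group("dbname"))
    return sorted(found)


if __name__ == "__main__":
    import sys

    # Usage: pipe agent logs in on stdin.
    for dbname in databases_missing_schema(sys.stdin):
        print(dbname)
```

If a database listed here is not the one the `datadog` schema and `explain_statement` function were created in, that mismatch would explain the warning.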