datadog-agent
[BUG] Datadog container erroring out trying to connect to some redis server
Agent Environment
- Agent version 7
- Environment - docker
Describe what happened
I am running the Datadog Agent (version 7) container as one of the services in Docker Compose:
version: "3.9"
services:
  app:
    image: app
    container_name: app
    hostname: app
    build:
      context: .
      dockerfile: Dockerfile
    restart: unless-stopped
    ports:
      - 8080:80
    volumes:
      - shared_volume:/tmp/logs
  datadog:
    container_name: dd-agent
    image: gcr.io/datadoghq/agent:7
    restart: always
    ports:
      - 8125:8125/udp
      - 8126:8126
    environment:
      - DD_API_KEY=${DATADOG_API_KEY}
      - DD_SITE=${DD_SITE}
      - DD_DOGSTATSD_NON_LOCAL_TRAFFIC=${DD_DOGSTATSD_NON_LOCAL_TRAFFIC}
      - DD_LOGS_ENABLED="true"
      - DD_APM_ENABLED="true"
      - DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL="true"
      - DD_CONTAINER_EXCLUDE_LOGS="name:dd-agent"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /proc/:/host/proc/:ro
      # - /opt/dd-agent/run:/opt/dd-agent/run:rw
      - /sys/fs/cgroup/:/host/sys/fs/cgroup:ro
volumes:
  shared_volume:
However, the datadog container runs into errors on startup. The error log shows that it is trying to connect to a Redis server. I am not sure where this is coming from, as I don't recall Redis being a dependency of Datadog.
The log is pasted below for convenience:
dd-agent | 2022-10-11 10:13:53 UTC | CORE | ERROR | (pkg/collector/worker/check_logger.go:69 in Error) | check:php_fpm | Error running check: [{"message": "Detected 1 error while loading configuration model `InstanceConfig`:\n__root__\n Field `status_url` or `ping_url` must be set", "traceback": "Traceback (most recent call last):\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py\", line 1091, in run\n initialization()\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py\", line 492, in load_configuration_models\n instance_config = self.load_configuration_model(package_path, 'InstanceConfig', raw_instance_config)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py\", line 536, in load_configuration_model\n raise_from(ConfigurationError('\\n'.join(message_lines)), None)\n File \"<string>\", line 3, in raise_from\ndatadog_checks.base.errors.ConfigurationError: Detected 1 error while loading configuration model `InstanceConfig`:\n__root__\n Field `status_url` or `ping_url` must be set\n"}]
dd-agent | 2022-10-11 10:13:57 UTC | CORE | ERROR | (pkg/collector/worker/check_logger.go:69 in Error) | check:redisdb | Error running check: [{"message": "Timeout connecting to server", "traceback": "Traceback (most recent call last):\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 611, in connect\n sock = self.retry.call_with_retry(\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/retry.py\", line 51, in call_with_retry\n raise error\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/retry.py\", line 46, in call_with_retry\n return do()\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 612, in <lambda>\n lambda: self._connect(), lambda error: self.disconnect(error)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 677, in _connect\n raise err\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 665, in _connect\n sock.connect(socket_address)\nsocket.timeout: timed out\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py\", line 1116, in run\n self.check(instance)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/redisdb/redisdb.py\", line 556, in check\n self._check_db()\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/redisdb/redisdb.py\", line 205, in _check_db\n info = conn.info()\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/commands/core.py\", line 970, in info\n return self.execute_command(\"INFO\", **kwargs)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/client.py\", line 1235, in execute_command\n conn = self.connection or pool.get_connection(command_name, **options)\n File 
\"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 1387, in get_connection\n connection.connect()\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 615, in connect\n raise TimeoutError(\"Timeout connecting to server\")\nredis.exceptions.TimeoutError: Timeout connecting to server\n"}]
Describe what you expected
I expected the container to boot up without any issues 🤷‍♂️
Steps to reproduce the issue
Just start a Datadog Agent container. The checks above fail right away.
Additional environment details (Operating System, Cloud provider, etc)
- Local setup (MacBook Pro M1)
- Running as docker using docker-compose
This looks like the Agent is trying to run the redis integration and failing due to some misconfiguration. Are you running any redis containers in docker?
Nope, not at all
A simple docker run without passing any configuration fails as well.
Could you provide the output of agent configcheck and agent status? You can run these commands inside the agent container, and they provide more data about what the agent has detected and is trying to run.
The two log lines you posted come from two checks, one called php_fpm and one called redisdb. The first command (agent configcheck) should show where the configuration for these checks is coming from.
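As a rough sketch of the diagnosis suggested above: the long traceback lines can be reduced to one "check: message" line per distinct failure. The function name check_errors is hypothetical, and the pattern assumes the log layout shown earlier in this thread ("| check:NAME | Error running check: [{"message": "..."); other Agent versions may format these lines differently.

```shell
# Reduce Agent error lines (read from stdin) to "check_name: message",
# one line per distinct failure. Lines that do not match are dropped.
check_errors() {
  sed -n 's/.*| check:\([A-Za-z0-9_]*\) | Error running check: \[{"message": "\([^"]*\)".*/\1: \2/p' \
    | sort -u
}

# Usage, assuming the container is named dd-agent as in the compose file:
#   docker logs dd-agent 2>&1 | check_errors
#   docker exec dd-agent agent configcheck   # config of each loaded check and its source
#   docker exec dd-agent agent status        # checks the Agent is running and their state
```

The docker commands are left as comments because they need a running Agent container; only the filter itself is defined.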
I think I had a stale Redis container that the Datadog Agent was trying to track as well. I modified my datadog service to track only a selected container and ignore the rest:
datadog:
  container_name: dd-agent
  image: gcr.io/datadoghq/agent:7
  restart: always
  ports:
    - 8125:8125/udp
    - 8126:8126
  environment:
    - DD_API_KEY=${DATADOG_API_KEY}
    - DD_SITE=${DD_SITE}
    - DD_DOGSTATSD_NON_LOCAL_TRAFFIC=true
    - DD_LOGS_ENABLED=true
    - DD_APM_ENABLED=true
    - DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL=true
    # exclude all containers from autodiscovery
    - DD_CONTAINER_EXCLUDE=name:.*
    # track only the containers below
    - DD_CONTAINER_INCLUDE=name:my_application
  volumes:
    - /var/run/docker.sock:/var/run/docker.sock:ro
    - /proc/:/host/proc/:ro
    # - /opt/dd-agent/run:/opt/dd-agent/run:rw
    - /sys/fs/cgroup/:/host/sys/fs/cgroup:ro
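Before excluding everything, it may help to confirm that a leftover container really is the trigger. The following is a minimal sketch, assuming the redisdb check is being scheduled because some running container uses a redis image; the helper name containers_with_image is hypothetical and only filters the name/image pairs that docker ps prints.

```shell
# Print the names of containers whose image reference contains the given word.
# Reads "name image" pairs on stdin, one container per line.
containers_with_image() {
  awk -v pat="$1" 'index($2, pat) > 0 { print $1 }'
}

# Usage against the local Docker daemon:
#   docker ps --format '{{.Names}} {{.Image}}' | containers_with_image redis
```

If this prints a container you did not expect, stopping or removing it is an alternative to the include/exclude settings above.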
Is there any update for this error? I installed the agent exactly as per the following guide, but I am still getting the error below.
docker command:
export DD_API_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
export DD_AGENT_VERSION=7.36.1
docker run -e "DD_API_KEY=${DD_API_KEY}" \
  -v /var/run/docker.sock:/var/run/docker.sock:ro \
  -l com.datadoghq.ad.check_names='["mysql"]' \
  -l com.datadoghq.ad.init_configs='[{}]' \
  -l com.datadoghq.ad.instances='[{
    "dbm": true,
    "host": "<AWS_INSTANCE_ENDPOINT>",
    "port": 3306,
    "username": "datadog",
    "password": "<UNIQUEPASSWORD>"
  }]' \
  gcr.io/datadoghq/agent:${DD_AGENT_VERSION}
errors:
2023-08-08 01:15:13 UTC | TRACE | INFO | (run.go:243 in Info) | No data received
2023-08-08 01:15:15 UTC | CORE | ERROR | (pkg/collector/worker/check_logger.go:69 in Error) | check:redisdb | Error running check: [{"message": "Timeout connecting to server", "traceback": "Traceback (most recent call last):\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 614, in connect\n sock = self.retry.call_with_retry(\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/retry.py\", line 50, in call_with_retry\n raise error\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/retry.py\", line 45, in call_with_retry\n return do()\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 615, in <lambda>\n lambda: self._connect(), lambda error: self.disconnect(error)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 680, in _connect\n raise err\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 668, in _connect\n sock.connect(socket_address)\nsocket.timeout: timed out\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/base/checks/base.py\", line 1120, in run\n self.check(instance)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/redisdb/redisdb.py\", line 552, in check\n self._check_db()\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/datadog_checks/redisdb/redisdb.py\", line 203, in _check_db\n info = conn.info()\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/commands/core.py\", line 900, in info\n return self.execute_command(\"INFO\", **kwargs)\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/client.py\", line 1192, in execute_command\n conn = self.connection or pool.get_connection(command_name, **options)\n File 
\"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 1386, in get_connection\n connection.connect()\n File \"/opt/datadog-agent/embedded/lib/python3.8/site-packages/redis/connection.py\", line 618, in connect\n raise TimeoutError(\"Timeout connecting to server\")\nredis.exceptions.TimeoutError: Timeout connecting to server\n"}]
I am also getting the same error
