
Connection refused while sending statsd metrics from container to ddoperator agent

Open: albertferras-verse opened this issue 3 years ago • 1 comment

Output of the info page (if this is a bug): The container logs this WARNING message:

WARNING:datadog.dogstatsd:Error submitting packet: [Errno 111] Connection refused, dropping the packet and closing the socket

The environment variables DD_ENV, DD_SERVICE, DD_ENTITY_ID, and DD_AGENT_HOST are set. We know they are correct because APM works properly.

The content of the datadog-agent YAML is:

apiVersion: datadoghq.com/v1alpha1
kind: DatadogAgent
metadata:
  name: datadog
  namespace: datadog
spec:
  credentials:
    apiSecret:
      secretName: datadog-secret
      keyName: api-key
    appSecret:
      secretName: datadog-secret
      keyName: app-key
  agent:
    image:
      name: "gcr.io/datadoghq/agent:latest"
    apm:
      enabled: true
      hostPort: 8126
    log:
      enabled: true
    config:
      env:
        - name: "DD_DOGSTATSD_NON_LOCAL_TRAFFIC"
          value: "true"
  clusterAgent:
    image:
      name: "gcr.io/datadoghq/cluster-agent:latest"
    config:
      admissionController:
        enabled: true
        mutateUnlabelled: false

Describe what happened: Hello, we're trying to send metrics to the DogStatsD server in the agents created by datadog-operator, but can't make it work. We're doing this in Python with the datadogpy library.

Describe what you expected: To be able to use datadogpy to push statsd metrics from the container when using datadog-operator. The documentation at https://docs.datadoghq.com/developers/dogstatsd/unix_socket/?tab=kubernetes suggests adding a new Datadog agent container to our service and using UDS with volumes, etc. However, I expected that this wouldn't be needed with datadog-operator, since we could just use the DogStatsD server of the agents that are already deployed and running. Is that possible, is it not supported yet, or are we doing something wrong?

Steps to reproduce the issue: See outputs of the info page

Additional environment details (Operating System, Cloud provider, etc): Latest datadog-operator version; services deployed with Kubernetes + Helm on Google Cloud Platform.

Additional notes: From inside a dd-agent pod, I ran echo -n "custom.metric.name:1|c" | nc -U -u -w1 /var/run/datadog/statsd/statsd.sock, and that works (I can see the metric appear in the Datadog UI). However, in that same container, I tried to send a metric from Python with datadogpy and no metric arrived, although no error message appeared.
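
For comparison, the nc check above can be reproduced from Python with only the standard library, which helps separate socket problems from client-library configuration. This is a minimal sketch, not the datadogpy implementation: the function name `send_uds_metric` is hypothetical, and it assumes the socket path from the nc command is visible inside the container running the code.

```python
import socket

# Path taken from the nc command above; adjust if the socket is mounted elsewhere.
STATSD_SOCKET_PATH = "/var/run/datadog/statsd/statsd.sock"


def send_uds_metric(payload: str, socket_path: str = STATSD_SOCKET_PATH) -> int:
    """Send one DogStatsD datagram over a Unix domain socket; return bytes sent."""
    # SOCK_DGRAM matches the -u flag passed to nc: one datagram per metric.
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        # connect() raises FileNotFoundError or ConnectionRefusedError if the
        # socket file is missing or nothing is listening on it.
        sock.connect(socket_path)
        return sock.send(payload.encode("utf-8"))
```

Calling `send_uds_metric("custom.metric.name:1|c")` should behave like the nc command. If this works but datadogpy stays silent, one thing to check is whether the client was pointed at the socket at all: datadogpy's `initialize` accepts a `statsd_socket_path` argument, and without it the client sends over UDP instead.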

albertferras-verse, Sep 05 '22 10:09

I finally got it working by moving from UDS to UDP. I had to set agent.config.hostPort: 8125 to allow UDP traffic from other containers:

apiVersion: datadoghq.com/v1alpha1
kind: DatadogAgent
metadata:
  name: datadog
  namespace: datadog
spec:
  credentials:
    apiSecret:
      secretName: datadog-secret
      keyName: api-key
    appSecret:
      secretName: datadog-secret
      keyName: app-key
  agent:
    image:
      name: "gcr.io/datadoghq/agent:latest"
    apm:
      enabled: true
      hostPort: 8126
    log:
      enabled: true
    config:
      hostPort: 8125
      env:
        - name: "DD_DOGSTATSD_NON_LOCAL_TRAFFIC"
          value: "true"
  clusterAgent:
    image:
      name: "gcr.io/datadoghq/cluster-agent:latest"
    config:
      admissionController:
        enabled: true
        mutateUnlabelled: false
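
With the host port from the configuration above exposed, a quick way to check UDP delivery from an application container is a standard-library sender. This is a sketch under assumptions rather than what datadogpy does internally: `format_metric` and `send_udp_metric` are hypothetical helper names, DD_AGENT_HOST is assumed to hold the node's IPv4 address, and 8125 is the port set in the YAML.

```python
import os
import socket
from typing import Optional, Sequence


def format_metric(name: str, value: float, metric_type: str = "c",
                  tags: Sequence[str] = ()) -> str:
    """Build a DogStatsD datagram: <name>:<value>|<type>[|#tag1,tag2]."""
    payload = f"{name}:{value}|{metric_type}"
    if tags:
        payload += "|#" + ",".join(tags)
    return payload


def send_udp_metric(payload: str, host: Optional[str] = None,
                    port: int = 8125) -> int:
    """Send one DogStatsD datagram over UDP; return bytes sent."""
    # DD_AGENT_HOST is the variable the issue already sets; fall back to localhost.
    host = host or os.environ.get("DD_AGENT_HOST", "127.0.0.1")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload.encode("utf-8"), (host, port))
```

For example, `send_udp_metric(format_metric("custom.metric.name", 1))` sends the same counter used in the nc test. Note that an unconnected UDP `sendto` does not report a closed port, so the absence of an exception here does not prove delivery; confirm that the metric shows up in the Datadog UI.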

Could you add these instructions to the DogStatsD documentation at https://docs.datadoghq.com/developers/dogstatsd/?tab=kubernetes so that other people don't run into the same problem? (A new "Operator" tab, like on other pages.) Thanks!

albertferras-verse, Sep 07 '22 08:09