
statsd client not sending metrics if k8s datadog agent restarts

Open vinay-kosaraju opened this issue 3 years ago • 3 comments

We have the datadog-agent deployed in Kubernetes as a daemonset. Whenever the daemonset-managed agent pods are terminated and new pods come up, the Java agent fails to deliver metrics to the Datadog agent until the Java application is restarted.

Our expectation is that the client should re-establish the connection once the agent pods are back up and running.

vinay-kosaraju, Jan 05 '22 15:01

Have you considered switching to Unix Domain Sockets?

https://docs.datadoghq.com/developers/dogstatsd/unix_socket/?tab=kubernetes

UDS has several benefits:

  • resistant to restarts (since it's not based on the agent IP which could change on restart)
  • better performance
  • finer grained security

Also, if you use /var/run/datadog/dsd.socket as the mounted UDS path, the Java Tracer will pick it up automatically.
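
To confirm that the socket is actually visible inside the application container, a quick check like the one below can help. This is only a sketch: the helper name dsd_socket_ready is made up for illustration, and the default path is the one mentioned above.

```shell
#!/bin/sh
# Hypothetical helper (not part of any Datadog tooling): succeeds only if
# the given path exists and is a Unix domain socket.
dsd_socket_ready() {
  [ -S "${1:-/var/run/datadog/dsd.socket}" ]
}

if dsd_socket_ready; then
  echo "DogStatsD socket found"
else
  echo "DogStatsD socket missing; check the volume mount" >&2
fi
```

If the check fails, the usual suspects are a missing volume mount in the application pod or UDS not being enabled on the agent side.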

mcculls, Jan 05 '22 16:01

Thanks, UDS works. Just curious, are there any other ways or workarounds to resolve this?

vinay-kosaraju, Jan 10 '22 20:01

The only other workaround I know of is to have a cronjob that periodically cleans entries from the conntrack table, e.g.:

conntrack -D -p udp --dport 8125

If the connection issue is caused by stale conntrack entries, then this should allow it to recover.
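
For the cronjob itself, something along these lines could be scheduled on each node. Treat it as a rough sketch: it assumes conntrack-tools is installed and that the script runs as root; DSD_PORT and flush_dsd_conntrack are names made up for this example.

```shell
#!/bin/sh
# Sketch of a periodic cleanup script, run as root on the node.
# DSD_PORT is a hypothetical variable; 8125 is the port used in the command above.
DSD_PORT="${DSD_PORT:-8125}"

flush_dsd_conntrack() {
  # Delete UDP conntrack entries whose destination port is the DogStatsD port.
  # The exit status is ignored so that a failed or empty run does not fail the job.
  conntrack -D -p udp --dport "$DSD_PORT" || true
}

flush_dsd_conntrack
```

How often to run it is a trade-off: metrics sent between an agent restart and the next cleanup are still lost, so a shorter interval narrows that window.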

mcculls, Jan 28 '22 19:01