dd-trace-java
statsd client not sending metrics if k8s datadog agent restarts
We have the datadog-agent deployed in Kubernetes as a daemonset. Whenever the daemonset-managed agent pods are terminated and new pods come up, the Java agent fails to deliver metrics to the agent until the Java application is restarted.
Our expectation is that the client should renegotiate the connection once the agent pods are back up and running.
Have you considered switching to Unix Domain Sockets?
https://docs.datadoghq.com/developers/dogstatsd/unix_socket/?tab=kubernetes
UDS has several benefits:
- resistant to restarts (since it's not based on the agent IP which could change on restart)
- better performance
- finer grained security
Also, if you use /var/run/datadog/dsd.socket as the mounted UDS path, then the Java Tracer will automatically pick this up.
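If it helps, here is a rough way to confirm from inside the application container that the socket is actually visible at that path before relying on it. This is only a sketch: the `check_dsd_socket` function and the `DSD_SOCKET` variable are names made up for this example, not part of the tracer or the agent.

```shell
#!/bin/sh
# Hypothetical helper: checks that the DogStatsD socket is mounted.
# The default path is the one mentioned above; override DSD_SOCKET if your mount differs.
DSD_SOCKET="${DSD_SOCKET:-/var/run/datadog/dsd.socket}"

check_dsd_socket() {
  # [ -S path ] is true only if the path exists and is a Unix socket.
  if [ -S "$1" ]; then
    echo "ok: $1 is a socket"
  else
    echo "missing: $1 is not a socket"
    return 1
  fi
}

# Report the result without aborting the calling shell.
check_dsd_socket "$DSD_SOCKET" || true
```

If this prints "missing", the volume mount in the application pod is the first thing to check.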
Thanks, UDS works. Just curious, are there any other ways or workarounds to resolve this?
The only other workaround I know of is to have a cronjob that periodically cleans entries from the conntrack table, e.g.:
conntrack -D -p udp --dport 8125
If the connection issue is caused by stale conntrack entries then this should allow it to recover.
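For what it's worth, the cronjob could wrap that command in a small script along these lines. Treat it as a sketch under assumptions: the `flush_dsd_conntrack` function and the `DSD_PORT` and `DRY_RUN` variables are invented for this example, and it defaults to a dry run that only prints the command. The real deletion most likely needs root and has to run wherever the conntrack table lives, which I would expect to be the node rather than the application container.

```shell
#!/bin/sh
# Hypothetical wrapper around the conntrack command shown above.
DSD_PORT="${DSD_PORT:-8125}"   # DogStatsD UDP port used in the command above
DRY_RUN="${DRY_RUN:-1}"        # 1 = only print the command, 0 = actually run it

flush_dsd_conntrack() {
  if [ "$DRY_RUN" = "1" ]; then
    # Dry run: show what would be executed.
    echo "conntrack -D -p udp --dport $1"
  else
    # Delete UDP conntrack entries for the given destination port.
    # "|| true" keeps the cron job from failing if the command exits non-zero.
    conntrack -D -p udp --dport "$1" || true
  fi
}

flush_dsd_conntrack "$DSD_PORT"
```

A crontab entry such as `*/5 * * * * DRY_RUN=0 /usr/local/bin/flush-dsd-conntrack.sh` would run it every five minutes; both the path and the interval are placeholders.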