dd-trace-js icon indicating copy to clipboard operation
dd-trace-js copied to clipboard

Add support for Unix Domain Sockets to DogStatsD client

Open maxenglander opened this issue 4 years ago • 9 comments

The DogStatsD client does not appear to support Unix Domain Sockets.

It would be great if this client could be extended to support Unix Domain Sockets.

Alternately, it would be good if dd-trace-js allowed the user to supply a custom DogStatsD client. Applications may use both dd-trace-js alongside a DogStatsD client such as hot-shots. Rather than have to configure hot-shots to use UDS, and to also configure dd-trace-js internal DogStatsD client to use UDS, it would be nice if we could just do:

DogStatsD = require('hot-shots');
dogstatsd = new DogStatsD({ protocol: 'uds' });
tracer = require('dd-trace-js').init({ dogstatsd });

maxenglander avatar Jun 22 '20 15:06 maxenglander

DogStatsD is an implementation detail of the tracer, so unfortunately it's not really possible to accept a custom client. This is especially true since we plan to switch from DogStatsD to using the existing endpoint on the agent. The good news is that this will not only improve the onboarding experience but it will also mean that UDS will just work.

rochdev avatar Jun 22 '20 16:06 rochdev

Thanks @rochdev! If there are plans to switch from DogStatsD to the existing agent endpoint, should this ticket just be closed? Or is there any work that could be done with this ticket in the mean time?

maxenglander avatar Jun 22 '20 17:06 maxenglander

At a minimum, it might be good to allow for dogstatsd.host to be an API option for init(), so that users can take advantage of the socat workaround suggested in DD docs.

maxenglander avatar Jun 22 '20 18:06 maxenglander

If there are plans to switch from DogStatsD to the existing agent endpoint, should this ticket just be closed?

I think it should stay open since the issue is not solved.

At a minimum, it might be good to allow for dogstatsd.host to be an API option for init(), so that users can take advantage of the socat workaround suggested in DD docs.

That makes sense. Since there is already an option for the port, adding an option for the host as well would definitely work as a workaround until we get to the endpoint change.

rochdev avatar Jun 22 '20 19:06 rochdev

Thanks @rochdev I'll submit a PR to make specifying DogStatsD host and option.

Do you have a rough idea for timeline for being able to send APM metrics through trace endpoint? Is that something that would/could be handled just with changes to dd-trace-js and also changes to the Golang datadog-agent, or does it also require changes to DataDog backend components?

maxenglander avatar Jun 22 '20 19:06 maxenglander

Do you have a rough idea for timeline for being able to send APM metrics through trace endpoint?

No timeline right now unfortunately.

Is that something that would/could be handled just with changes to dd-trace-js and also changes to the Golang datadog-agent, or does it also require changes to DataDog backend components?

This should only require changes in the tracer and the agent. We have already started experimenting in branches but nothing production-ready yet.

rochdev avatar Jun 25 '20 15:06 rochdev

I see that there is platform folder with nodejs support for metrics. Would be nice if it was exposed so we can send out custom gauge-s to DD

ghost avatar Jul 30 '20 10:07 ghost

@dusan-dragon We implemented custom support internally to have more control on our specific use case for runtime metrics, but for any other custom metrics you can use any DogStatsD library such as hot-shots.

rochdev avatar Aug 12 '20 20:08 rochdev

I think consumer lag for kafka is pretty standard metric. Currently you have plugin for kafkajs but you do not support consumer lag metric

ghost avatar Mar 30 '22 18:03 ghost

Hi @rochdev - could you please provide an update on this ticket (ie. will any of the current efforts you're working on solve this?).

janeklb avatar Nov 23 '22 12:11 janeklb

At a minimum, it might be good to allow for dogstatsd.host to be an API option for init(), so that users can take advantage of the socat workaround suggested in DD docs.

@maxenglander out of curiosity, how are you running socat / how is it helping you?

janeklb avatar Nov 23 '22 12:11 janeklb

@janeklb If you configure dd-trace to use UDS, it will now also use it for runtime metrics as long as you have a recent enough version of both the library and the agent.

rochdev avatar Nov 23 '22 17:11 rochdev

if you configure dd-trace to use UDS,

I've got dd-trace configured to use UDS for tracing (using DD_TRACE_AGENT_URL=unix:///var/run/datadog/apm.socket), but when I use that I don't get runtime metrics reported. Should they be?. Is there something else that needs to be set? Afaiu statsd datagrams should be sent to the dsd.socket (not apm.socket), but I don't see how that could be configured. The dogstatsd client https://github.com/DataDog/dd-trace-js/blob/master/packages/dd-trace/src/dogstatsd.js#L13-L31 doesn't seem to have any references to UDS.

In this discussion thread https://github.com/DataDog/dd-trace-js/discussions/2470#discussioncomment-3973852 you mentioned

This change allows us to support additional features like reporting metrics over UDS.

But reading through the PR i don't see any mention of UDS either

FWIW the agent is running as a daemonset in an EKS cluster, using the image gcr.io/datadoghq/agent:7-jmx

janeklb avatar Nov 23 '22 22:11 janeklb

but when I use that I don't get runtime metrics reported

Was it working before without UDS? Or is this the first time you setup the feature? It's still necessary to follow the steps in the setup docs for both the library and the agent before runtime metrics can be sent.

But reading through the PR i don't see any mention of UDS either

The PR adds support for reporting runtime metrics through HTTP with the existing HTTP client in the library, which already supports sending HTTP through UDS, so it doesn't explicitly add support for UDS.

FWIW the agent is running as a daemonset in an EKS cluster, using the image gcr.io/datadoghq/agent:7-jmx

Can you confirm the exact version that is installed? It should be at a minimum 7.40.

rochdev avatar Nov 23 '22 23:11 rochdev

Was it working before without UDS?

Yes, it was working fine before (we had DD_AGENT_HOST configured correctly, and the agent was listening on 8125/udp and 8126)

The PR adds support for reporting runtime metrics through HTTP with the existing HTTP client in the library, which already supports sending HTTP through UDS, so it doesn't explicitly add support for UDS.

thanks for clarifying

Can you confirm the exact version that is installed? It should be at a minimum 7.40.

I can't say for sure -- the image hash is 8d4cf9a5bd7e1520170904081279eb5a2bf1bfd1caabb04426e3017c06e45461 - docker hub lists it as 14 days old.. so probably

janeklb avatar Nov 24 '22 00:11 janeklb

It may be worth noting that we have hot-shots working using protocol: "uds" (though we are occasionally seeing the "congestion" error).

Also, the agent's statsd container is reporting:

datadog-q8mmp agent 2022-11-24 00:03:48 UTC | CORE | WARN | (pkg/dogstatsd/listeners/uds_common.go:211 in Listen) | dogstatsd-uds: error processing origin, data will not be tagged : matched PID for the process is 0, it belongs probably to another namespace. Is the agent in host PID mode?

so we may need to set hostPID: true (as per this guide), but that seems like a different issue

janeklb avatar Nov 24 '22 00:11 janeklb

@janeklb Looks like there was a bug in the agent which should be fixed in the next release. I believe it should also be in the latest release candidate, which as of now is 7.41.0-rc.5

rochdev avatar Nov 24 '22 01:11 rochdev

Thanks @rochdev - does that mean my configuration above is sufficient?

janeklb avatar Nov 24 '22 19:11 janeklb

@janeklb Since you said it was working before with UDP, then yes the only change that should be needed (other than the URL pointing to UDS instead of HTTP) is upgrading the agent.

rochdev avatar Nov 24 '22 22:11 rochdev

@janeklb Since you said it was working before with UDP, then yes the only change that should be needed (other than the URL pointing to UDS instead of HTTP) is upgrading the agent.

Thanks - I'll give it a try when the agent is updated

janeklb avatar Nov 25 '22 23:11 janeklb