
add support for datadog protocol v1.2

Open willkg opened this issue 1 year ago • 2 comments

The Datadog Python library 0.45.0 in a k8s environment adds a container id to the metrics being emitted.

With datadog_extensions: true, Telegraf understands the tags part but not the container id part, and it logs this error:

2023-03-27T19:53:44Z E! [inputs.statsd] Splitting '|', unable to parse metric: eliot.diskcache.usage:0|g|c:f1ce8f072aab3dbe86e5dd91dcf83ad7a2a9147f26c45f5549eddd1a6608f763

The container id is part of datadog protocol v1.2 which has a line format like this:

<METRIC_NAME>:<VALUE>|<TYPE>|#<TAG_KEY_1>:<TAG_VALUE_1>,<TAG_2>|c:<CONTAINER_ID>

https://docs.datadoghq.com/developers/dogstatsd/datagram_shell/?tab=metrics#dogstatsd-protocol-v12
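To make the line format above concrete, here is a minimal Go sketch that splits one metric line into its parts. It is not Telegraf's code: the `Datagram` type and the `parseV12` function are hypothetical names made up for illustration, and it assumes the optional segments after the type are identified only by their prefix (`@` for sample rate, `#` for tags, `c:` for container id).

```go
package main

import (
	"fmt"
	"strings"
)

// Datagram holds the parts of one DogStatsD metric line.
// Hypothetical type for illustration; not part of Telegraf.
type Datagram struct {
	Name        string
	Value       string
	Type        string
	SampleRate  string
	Tags        []string
	ContainerID string
}

// parseV12 splits a line shaped like
// <METRIC_NAME>:<VALUE>|<TYPE>|#<TAGS>|c:<CONTAINER_ID>.
// Hypothetical helper; values are kept as strings and not validated further.
func parseV12(line string) (Datagram, error) {
	var d Datagram
	name, rest, ok := strings.Cut(line, ":")
	if !ok || name == "" {
		return d, fmt.Errorf("missing metric name in %q", line)
	}
	segs := strings.Split(rest, "|")
	if len(segs) < 2 || segs[0] == "" || segs[1] == "" {
		return d, fmt.Errorf("missing value or type in %q", line)
	}
	d.Name, d.Value, d.Type = name, segs[0], segs[1]
	// Everything after the type is optional and recognized by prefix.
	for _, seg := range segs[2:] {
		switch {
		case strings.HasPrefix(seg, "@"):
			d.SampleRate = seg[1:]
		case strings.HasPrefix(seg, "#"):
			d.Tags = strings.Split(seg[1:], ",")
		case strings.HasPrefix(seg, "c:"):
			d.ContainerID = seg[2:]
		default:
			return d, fmt.Errorf("unrecognized segment %q", seg)
		}
	}
	return d, nil
}

func main() {
	line := "eliot.diskcache.usage:0|g|c:f1ce8f072aab3dbe86e5dd91dcf83ad7a2a9147f26c45f5549eddd1a6608f763"
	d, err := parseV12(line)
	fmt.Println(d.Name, d.Value, d.Type, d.ContainerID, err)
}
```

Dispatching on the prefix rather than on position means the container id is accepted whether or not tags or a sample rate are present.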

It'd be great if telegraf could add support for parsing the container id in datadog protocol v1.2.

willkg avatar Mar 30 '23 16:03 willkg

Hi,

Next time, could you please use the bug report or feature request template? It asks for a) a config and b) the full list of log messages, which gives us a quick way to reproduce. Thanks!

Steps to reproduce:

[[inputs.statsd]]
  protocol = "udp"
[[outputs.file]]

echo "eliot.diskcache.usage:0|g|c:f1ce8f072aab3dbe86e5dd91dcf83ad7a2a9147f26c45f5549eddd1a6608f763" | nc -C -w 1 -u localhost 8125

2023-03-31T21:20:12Z D! [inputs.statsd] Sample rate must be in format like: @0.1, @0.5, etc. Ignoring sample rate for line: eliot.diskcache.usage:0|g|c:f1ce8f072aab3dbe86e5dd91dcf83ad7a2a9147f26c45f5549eddd1a6608f763
f1ce8f072aab3dbe86e5dd91dcf83ad7a2a9147f26c45f5549eddd1a6608f763
2023-03-31T21:20:12Z E! [inputs.statsd] Splitting '|', unable to parse metric: eliot.diskcache.usage:0|g|c:f1ce8f072aab3dbe86e5dd91dcf83ad7a2a9147f26c45f5549eddd1a6608f763

It looks like this is taking place in parseStatsdLine. Currently, the extra pipe-delimited field sends the parser down the branch that tries to read trailing fields as a sample rate.

In addition to handling the actual container ID, we would also need to rework how caching and metric building are handled. Right now there is an assumption of a single field, which is used to set the metric type. The new container ID field would need to be added and handled.

I am also not sure what work is required for events and if different paths are taken when the datadog specific settings are enabled.

powersj avatar Mar 31 '23 22:03 powersj

I think it would be a nice improvement even if it just ignored this container field (rather than modeling it), so that it did not choke on these metrics.
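As a rough sketch of the "just ignore it" approach, something like the following could drop the container field before the line reaches the existing parsing logic. This is only an assumption about where such a step might live; `stripContainerID` is a hypothetical helper, not an existing Telegraf function.

```go
package main

import (
	"fmt"
	"strings"
)

// stripContainerID removes any "c:<CONTAINER_ID>" segment that follows the
// value and type, leaving the rest of the line untouched.
// Hypothetical helper for illustration; not part of Telegraf.
func stripContainerID(line string) string {
	segs := strings.Split(line, "|")
	kept := make([]string, 0, len(segs))
	for i, seg := range segs {
		// Segments 0 and 1 are "<name>:<value>" and "<type>"; never drop
		// them, even for a metric that happens to be named "c".
		if i >= 2 && strings.HasPrefix(seg, "c:") {
			continue
		}
		kept = append(kept, seg)
	}
	return strings.Join(kept, "|")
}

func main() {
	line := "eliot.diskcache.usage:0|g|c:f1ce8f072aab3dbe86e5dd91dcf83ad7a2a9147f26c45f5549eddd1a6608f763"
	fmt.Println(stripContainerID(line)) // prints "eliot.diskcache.usage:0|g"
}
```

The trade-off is that the container id is lost entirely, so it could not later be exposed as a tag without revisiting this step.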

hozn avatar Apr 21 '24 02:04 hozn