datadog-agent
Feature Request: Blacklist Host Tags
It would be great if it were possible to strip host tags off of metrics. Tags such as the autoscaling group a metric is coming from are not very valuable and can clutter the tags for a particular metric. Being able to exclude tags based on a regex would make it possible to strip multiple tags at a time, including generated ones.
To give an example, in a cattle-like environment I don't need `host`, `internal-hostname`, `instance-id`, `instance-template`, or `created-by` tags on my metrics, as these are automatically generated and cycled in the runtime environment. Maybe I need one of `host` OR `instance-id`, but I'm usually not drilling down to that level, especially if I'm more concerned with high-level metrics.
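To make the regex idea concrete, here is a small hypothetical sketch of stripping tags whose keys match configured patterns; nothing in the agent works this way today, and the function name and pattern list are made up:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// stripMatchingTags removes any "key:value" tag whose key matches one of the
// configured regular expressions. The example patterns below are made up.
func stripMatchingTags(tags []string, patterns []string) []string {
	res := make([]*regexp.Regexp, 0, len(patterns))
	for _, p := range patterns {
		res = append(res, regexp.MustCompile(p))
	}
	var kept []string
	for _, tag := range tags {
		key, _, _ := strings.Cut(tag, ":")
		drop := false
		for _, re := range res {
			if re.MatchString(key) {
				drop = true
				break
			}
		}
		if !drop {
			kept = append(kept, tag)
		}
	}
	return kept
}

func main() {
	tags := []string{"host:ip-10-0-0-1", "instance-id:i-0abc", "instance-template:web-v3", "service:api"}
	fmt.Println(stripMatchingTags(tags, []string{`^instance-`, `^created-by$`}))
	// prints [host:ip-10-0-0-1 service:api]
}
```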
We could really use this too. We have high cardinality of tags, but ALL of our metrics are annotated by the various hosts too, multiplying the whole thing enormously.
I believe the instance-related tags are all added by the Datadog agent itself, right? Based on where the data is sourced from.
:+1: on the possibility to remove host tags!
I'd also appreciate the possibility to remove the `kube_replicaset` tag. Maybe one could have a generic blacklist in the agent?
hmm, the issue was created in 2019, but I assume this exclude option is still not implemented, right? Or did you guys figure out some workaround by any chance?
I think Metrics Without Limits can work around the whole pricing aspect of this extra cardinality, but the actual remove-tags feature doesn't exist, as far as I know.
We would like to have this feature as well. This should be configurable through the AWS integration configuration OR the Datadog agent.
I will say that I've tried overwriting these tags to a single dummy value in the `datadog.yaml` file:

```yaml
tags:
  - aws:ec2:fleet-id:dummy
```

But that `dummy` value just gets added to the list of tag values for that tag key.
My team could really use this too. Really sad to see that this was requested several years ago but appears to have been neither satisfied nor rejected.
We require tags to be filtered/removed at source (ingest) as well. Metrics Without Limits only removes tags during indexing, not ingest.
Our company requires this feature too; the cost of many unused tags is high.
Same here - I would appreciate the possibility to create a filter mask for tags. In principle, monitoring should not affect infrastructure, nor should it drive up cost (tags play a functional role in some scenarios). Since tagging is a cash cow for Datadog, I doubt someone will pick this up :(
I wrote a Ruby script that adds custom tag groups that exclude tags you don't want. You'll want to tailor it to your use case, or use it as inspiration. It takes about 8 hours to run against the Datadog account I work on, which has around 10k metrics. My company set up a Datadog monitor for custom-metrics usage, and we re-run this script whenever it alerts.
I agree with others that Datadog's tooling for managing tags on large numbers of metrics is poor. Having Terraform configuration for thousands of metrics isn't practical, and neither is manually configuring them through the web UI. All we're asking for is a blacklist, which seems a lot easier to implement than many other parts of Datadog's tooling, which I'm generally impressed by.
Following up on this one. There was a question about supporting this not only for statsd but also for openmetrics. Also, as I see it, we currently have tag exclusion for EC2 tags. Do we want to make a generic pkg so that these exclude-tags features all work the same way by consuming string slices from the configuration?
I'm happy to implement this if I get a green flag from the maintainer team.
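Something along these lines is what I have in mind; this is only a sketch under my own assumptions, and the package and function names are hypothetical rather than anything that exists in the agent today:

```go
// Package tagexclude is a hypothetical shared helper; all names here are illustrative only.
package tagexclude

import "strings"

// Excluder drops tags whose keys appear in a configured list.
type Excluder struct {
	keys map[string]struct{}
}

// New builds an Excluder from a string slice, e.g. read from the agent configuration.
func New(excludedKeys []string) *Excluder {
	keys := make(map[string]struct{}, len(excludedKeys))
	for _, k := range excludedKeys {
		keys[strings.ToLower(k)] = struct{}{}
	}
	return &Excluder{keys: keys}
}

// Filter returns only the "key:value" tags whose key is not in the excluded set.
func (e *Excluder) Filter(tags []string) []string {
	out := make([]string, 0, len(tags))
	for _, tag := range tags {
		key, _, _ := strings.Cut(tag, ":")
		if _, drop := e.keys[strings.ToLower(key)]; !drop {
			out = append(out, tag)
		}
	}
	return out
}
```

Both the DogStatsD and the OpenMetrics paths could then build one Excluder from their respective config slices and call Filter per sample.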
tagging @alexb-img and @olivielpeau because you were active on https://github.com/DataDog/datadog-agent/pull/6526
I'm a PM at Datadog and want to chime in here to provide context that we are aware of this feature request and are actively looking for details about the use cases, telemetry pipeline needs and pain-points. We highly encourage customers to reach out to us via our support channel (https://www.datadoghq.com/support/) or your CSM contact about this topic! Thanks!
@LutaoX Let me just say that it's super awesome to hear from a PM in a public setting, because until now that's definitely not been the norm from my experience with requests on this repo; my company had internally started assuming that filing feature requests in Github to go along with our support cases was pointless.
Anyway, I'm pretty sure I've reached out before about this and ended up having a support case where I linked this issue, but I don't want to go dig it back up and cause confusion on the customer support end by necro-ing a 2-3 year old issue. One of the big use cases for us is in Kubernetes, where we don't want the `host` tag from the reporting agent to get applied to statsd metrics, because (to make a long story short) `internalTrafficPolicy: local` does not actually turn off kube-proxy's load balancing behavior, thereby making it impossible for pod Foo on host A to guarantee that any statsd metrics it sends aren't tagged as potentially coming from other hosts. (I brought it up with sig-network at some point, but they told me that that's intentional behavior of kube-proxy and that a KEP would be needed to add a new option to actually only send traffic to the local pod.)
Obviously this would be a different part of the codebase, but there's also the matter of EC2 tags where we want some of those tags (such as `Env`) but not others (such as weird inventorying tags that the company forces us to add but which we don't want showing up in the Datadog web UI). I'm not sure if that falls under "blacklist host tags" but it sure feels adjacent to it at the very least, and I'm nearly certain I've also brought this up with customer support, maybe even in the same breath as the matter of the `host` tag in Kubernetes.
I'm only commenting about my use cases here to make sure they get a little bump and to let anyone who's in the same boat as I am just reference my comment when they open their support tickets. Like, if one can summarize one's request by saying "please do what this guy on github said" then hey, I saved them time :)
I implemented this for k8s workloads; I guess my implementation could be expanded to host tags as well, given that I added a `removeTags` method to the taglist.
https://github.com/DataDog/datadog-agent/pull/20161
> I'm a PM at Datadog and want to chime in here to provide context that we are aware of this feature request and are actively looking for details about the use cases, telemetry pipeline needs and pain-points. We highly encourage customers to reach out to us via our support channel (https://www.datadoghq.com/support/) or your CSM contact about this topic! Thanks!
I created a support case around the issue not long ago. @LutaoX , I don't know if you can access it, but here's the case: http://help.datadoghq.com/hc/requests/1402253
> We require tags to be filtered/removed at source (ingest) also. Metrics Without Limits only removes tags during indexing, not ingest.
And that's exactly the issue. While you can define which tags you want to have indexed with Metrics Without Limits (and that's good as the indexing is really expensive), you cannot define what is ingested. And though the price per ingested metric is much lower than indexed, if you have many hosts with many custom metrics, you still end up with a super high bill.
Pragmatically speaking, is there a workaround other than using something other than Datadog to reduce high counts of custom metrics? More precisely, my reading of this GitHub issue and the problem at hand is that the Datadog agent always adds tags, some of which have high cardinality like `host` and `name`, which significantly raises the number of custom metrics and the customer's bill outside of the customer's control. The two ways I am aware of to remove these tags are either to use a custom Datadog agent or to not use Datadog altogether, and it would be great to learn of a viable workaround.
IMO, ideally the agent would allow custom tag transformers that receive a tag-value pair from the agent and return a tag-value pair to send to the Datadog service, where returning `null` would drop that tag-value pair. But even a simpler API to drop tags regardless of their values would be a great feature.
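To make the transformer idea concrete, here is a minimal sketch of the kind of hook I am imagining; this is purely hypothetical, not anything the agent exposes today, and the sample tags are made up:

```go
package main

import (
	"fmt"
	"strings"
)

// TagTransformer receives a tag key and value and returns the pair to send.
// Returning ok=false drops the tag entirely (the "return null" case above).
type TagTransformer func(key, value string) (newKey, newValue string, ok bool)

// applyTransformer runs a transformer over a slice of "key:value" tags.
func applyTransformer(tags []string, t TagTransformer) []string {
	out := make([]string, 0, len(tags))
	for _, tag := range tags {
		key, value, _ := strings.Cut(tag, ":")
		if k, v, ok := t(key, value); ok {
			out = append(out, k+":"+v)
		}
	}
	return out
}

func main() {
	// Drop the high-cardinality tags mentioned above and keep everything else.
	dropHostAndName := func(key, value string) (string, string, bool) {
		if key == "host" || key == "name" {
			return "", "", false
		}
		return key, value, true
	}
	tags := []string{"host:i-0abc123", "name:worker-7", "env:prod"}
	fmt.Println(applyTransformer(tags, dropHostAndName)) // prints [env:prod]
}
```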
@ide I never tested this, but there is a Vector proxy service you can deploy between the agent and Datadog. You'd have to reconfigure agents and build your filtering rules, but anything can become financially viable at a certain threshold.
Hi everyone! I think the concern about blacklisting host tags here is super valid, whether or not the Datadog Agent ends up supporting it. For now, I have tested with Vector.dev to drop some unnecessary tags before they are ingested and indexed by Datadog. Since this is super important for us (and our budget) but the feature may take a while to be implemented on the Datadog side, I'm sharing my implementation here for visibility.
Our architecture looks like this:
Datadog-Agent --> Vector.dev --> Datadog Platform
Step 1: Note that your DD-Agent version must be higher than 7.45.1 so that the environment variables below can be applied:
```
DD_OBSERVABILITY_PIPELINES_WORKER_METRICS_ENABLED - boolean - optional - default: false
## Enables forwarding of metrics to an Observability Pipelines Worker
DD_OBSERVABILITY_PIPELINES_WORKER_METRICS_URL - string - optional - default: ""
## This is the URL of the vector.dev service that you need to enter
DD_OBSERVABILITY_PIPELINES_WORKER_LOGS_ENABLED - boolean - optional - default: false
## Enables forwarding of logs to an Observability Pipelines Worker
DD_OBSERVABILITY_PIPELINES_WORKER_LOGS_URL - string - optional - default: ""
DD_OBSERVABILITY_PIPELINES_WORKER_TRACES_ENABLED - boolean - optional - default: false
## Enables forwarding of traces to an Observability Pipelines Worker
DD_OBSERVABILITY_PIPELINES_WORKER_TRACES_URL - string - optional - default: ""
```
Step 2: Set up the vector.dev service (it can be managed with Helm, so please utilise it). The outcome is that you have the service up and running and a working endpoint to receive the data.
Step 3: The rule that I used on vector.dev to discard the unnecessary tags:

```yaml
sources:
  ## This allows your vector.dev to receive the traffic from the Datadog-Agent
  datadog_agents:
    type: datadog_agent
    address: 0.0.0.0:8282
    multiple_outputs: true
    store_api_key: false

## This deletes the two tags pod_phase & namespace, and also appends an image_id tag with the default value themystery
transforms:
  drop_one_tag:
    type: remap
    inputs:
      - datadog_agents.metrics
    source: |-
      del(.tags.pod_phase)
      del(.tags.namespace)
      .tags.image_id = "themystery"

## Output all of it to the Datadog Platform
sinks:
  datadog_metrics:
    type: datadog_metrics
    inputs:
      - drop_one_tag
    compression: gzip
    default_api_key: ${DD_API_KEY}
    site: "datadoghq.eu"
```
Finally, you can check your metric after it arrives in the Datadog Platform; the `pod_phase` / `namespace` tags will not appear anymore.
Hope it helps everyone (not just as a short-term fix but as a long-term model too).
We have this same issue as well, particularly in an Azure AKS environment. In our case, one team looks after the AKS infrastructure, so they tag the underlying infrastructure with common tags such as team, service, and env, as well as some custom tags such as 'costcentre'. We then have multiple application teams that run their applications on top of the provided AKS clusters, and those teams also tag their deployments with the same tags. What we are seeing, on logging especially, are duplicate tags coming from both the host tags and the application logs. This is becoming a pain point, as we back-bill customers for their logging usage in Datadog. We've checked, double-checked, and re-checked all our configuration to make sure none of the DD-agent settings that pull host tags as labels are enabled, so the agent should only be applying a customer's application tags to its logs and metrics.
> More precisely, my reading of this GitHub issue and the problem at hand is that the Datadog agent always adds tags, some of which have high cardinalities like host and name, which significantly raises the number of custom metrics and the customer's bill outside of the customer's control.
For those facing issues with high cardinality due to host tags in Datadog, I’d like to share a solution that worked for me.
In the official Datadog documentation on DogStatsD metrics submission - host tags, there is a somewhat vague but crucial statement: "The submitted host tag overrides any hostname collected by or configured in the Agent." Leveraging this, I discovered that by adding an empty `host:` tag to all generated metrics, I successfully eliminated the unnecessary host tag that was significantly increasing cardinality. Now, all custom metrics submitted to Datadog only include the tags that I intend to include.
My breakthrough came from Datadog Agent release 6.6.0, which introduced an enhancement allowing DogStatsD to support the removal of hostnames on events and service checks, similar to metrics, by adding an empty `host:` tag.
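For anyone who wants to try this, here is a minimal sketch using the official Go DogStatsD client (github.com/DataDog/datadog-go); the metric name, the extra tag, and the agent address are placeholders I made up, not part of the original suggestion:

```go
package main

import (
	"log"

	"github.com/DataDog/datadog-go/v5/statsd"
)

func main() {
	// Placeholder address: point this at your local agent's DogStatsD port.
	client, err := statsd.New("127.0.0.1:8125")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// The empty "host:" tag asks the agent to drop the hostname from this metric,
	// per the host tag override behaviour described above.
	tags := []string{"host:", "env:prod"}
	if err := client.Gauge("custom.queue.depth", 42, tags, 1); err != nil {
		log.Fatal(err)
	}
}
```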
@lifttocode Thank you for sharing that. Specifying `host:` appears to have removed the "host" tag.
Relatedly, I tried doing the same for other tags (`other_tag:`) but the behavior is slightly different. The agent will use `other_tag:` to override the tag that the agent would have otherwise specified, but it still sends `other_tag:` to Datadog. The Datadog website UI displays the tag as `other_tag` without a value. Trying it out is the easiest way to see the behavior for yourself. It is still useful to be able to override tags that would otherwise increase your ingested or indexed tag counts.