__agent_hostname should be available as a meta label for prometheus scrape configs
It would be useful if __agent_hostname was available as a meta label for prometheus scrape configs.
Our specific case is to run the agent on each node, with a static scrape config and a local endpoint. For example, our configuration might look like this:
prometheus:
  configs:
    - name: agent
      host_filter: false
      scrape_configs:
        - job_name: example
          metrics_path: '/metrics'
          static_configs:
            - targets: ['127.0.0.1:8080']
The labels for this might look something like:
"labels": {
  "instance": "127.0.0.1:8080",
  "job": "example"
},
"discovered_labels": {
  "__address__": "127.0.0.1:8080",
  "__metrics_path__": "/metrics",
  "__scheme__": "http",
  "job": "example"
},
This series will not be unique across multiple instances, because every agent reports the same instance and job labels. Setting the target to ${HOSTNAME}:8080 requires enabling variable interpolation, and may resolve to a network interface rather than 127.0.0.1.
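To make the collision concrete, here is a minimal Python sketch. It assumes a series is identified by its metric name plus its full label set; `series_key` and `target_labels` are hypothetical helpers written for illustration, not part of the agent or Prometheus.

```python
def target_labels(address, job, instance=None):
    """Build the final target labels; instance falls back to the address."""
    return {"job": job, "instance": instance or address}


def series_key(metric, labels):
    """Identify a series by metric name plus its sorted label pairs."""
    return (metric, tuple(sorted(labels.items())))


# Two different nodes scraping their own local endpoint end up with the same key.
node_a = series_key("up", target_labels("127.0.0.1:8080", "example"))
node_b = series_key("up", target_labels("127.0.0.1:8080", "example"))

# Overriding instance with a per-host value (hypothetical hostnames) separates them.
fixed_a = series_key("up", target_labels("127.0.0.1:8080", "example", "node-a:8080"))
fixed_b = series_key("up", target_labels("127.0.0.1:8080", "example", "node-b:8080"))
```

With the default labels, `node_a == node_b` holds, so the two series are indistinguishable once they reach the same backend. With a per-host instance label, the keys differ.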
The solution I've found is to set the instance label manually using interpolation:
static_configs:
  - targets: ['127.0.0.1:8080']
    labels:
      instance: "${HOSTNAME}:8080"
However, this requires enabling variable interpolation, and seems to be somewhat less reliable than the mechanism used to set the instance label on the agent integrations.
Ideally, agent_hostname would be available as a meta label globally, so that I could use it in a relabel config wherever I would like.
Hey! This is definitely possible for us to do, but it would require a fair amount of work so I'm wondering if we can discuss the workaround a little bit.
Environment variable expansion is what I've been recommending to people so far. Is it possible for you to set another environment variable with a value you know to be correct so you can have more confidence in the expansion?
I'm using this relabel config for the time being:
relabel_configs:
  - source_labels: ["__address__"]
    target_label: "instance"
    regex: "127.0.0.1:([0-9]+)$"
    replacement: "${HOSTNAME:-127.0.0.1}:$1"
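For readers unfamiliar with how this rule behaves, the following Python sketch emulates it under two assumptions: that the config loader expands `${VAR:-default}` before the rule is parsed, and that the relabel regex is matched against the whole `__address__` value. The function names are hypothetical; this is not the agent's implementation.

```python
import re


def expand_env(template, env):
    """Expand ${VAR:-default}; the default applies when VAR is unset or empty."""
    def substitute(match):
        return env.get(match.group(1)) or match.group(2)
    return re.sub(r"\$\{(\w+):-([^}]*)\}", substitute, template)


def relabel_instance(address, env):
    """Return the instance label the rule above would produce for an address."""
    replacement = expand_env("${HOSTNAME:-127.0.0.1}:$1", env)
    # Relabel regexes are anchored, so the whole address has to match.
    match = re.fullmatch(r"127.0.0.1:([0-9]+)$", address)
    if match is None:
        # The rule does not apply; instance then defaults to the address.
        return address
    return replacement.replace("$1", match.group(1))


print(relabel_instance("127.0.0.1:8080", {"HOSTNAME": "node-a"}))  # node-a:8080
print(relabel_instance("127.0.0.1:8080", {}))                      # 127.0.0.1:8080
```

The second call shows the fallback: without HOSTNAME in the environment, the instance label stays at 127.0.0.1:8080 and is not unique across nodes.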
I'm attempting to provide a reusable configuration for other groups in our organisation. So, we don't necessarily have full control over our environment.
I could probably find a way to ensure that HOSTNAME (or a similar variable) is set, but it does create a bit of additional complexity and some fragility for our deployments.
I'd also prefer to have something that doesn't rely on variable expansion in the configuration file. I would prefer not to have variable expansion enabled if this is our only use-case.
The replace_instance_label configuration setting for the integrations works great, and I'd like to have that for static configs.
Our primary use case is to deploy the agent to scrape local services on EC2 instances, and deployments using the agent as a sidecar. In both those cases, we want a robust method to ensure that the instance labels are unique.
An alternative might be a "localhost" service discovery mechanism that provides a bunch of meta labels about the host (or kube pod), but otherwise behaves like a static config with localhost as the target.
Having agent_hostname available on scrape configs would help unify metrics. I tried to find a way to achieve what you want without code changes or the environment variable, but unfortunately did not find anything. We are considering options to achieve more consistency.
To add to what Matt said, we'll share the options we find with you in case you want to be involved. In the meantime, my understanding is that the workarounds are:
- Use environment variable expansion
- Use some kind of preprocessing of config files per machine and output the final YAML file for Grafana Agent to read.
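The preprocessing workaround could look roughly like the Python sketch below. The `@@HOSTNAME@@` placeholder and the `render_config` helper are assumptions made for illustration, not agent features. The idea is to render the final YAML once per machine and point the agent at the rendered file.

```python
import socket

# Hypothetical marker; chosen so it cannot clash with ${VAR} or $1 in the config.
PLACEHOLDER = "@@HOSTNAME@@"

TEMPLATE = """\
static_configs:
  - targets: ['127.0.0.1:8080']
    labels:
      instance: "@@HOSTNAME@@:8080"
"""


def render_config(template, hostname=None):
    """Replace the placeholder with this machine's hostname (or an override)."""
    host = hostname if hostname is not None else socket.gethostname()
    return template.replace(PLACEHOLDER, host)
```

A deployment step would write `render_config(TEMPLATE)` to disk before starting the agent, so variable expansion does not need to be enabled in the agent itself.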
I've thought about this more, and personally I think this proposal makes sense and is a good generic candidate to replace host_filter: true. I'm going to raise this with other maintainers to get a consensus going.
We've been discussing this upstream to try to find a generic way to allow both Prometheus and Grafana Agent users to take advantage of this.
This issue has been automatically marked as stale because it has not had any activity in the past 30 days. The next time this stale check runs, the stale label will be removed if there is new activity. The issue will be closed in 7 days if there is no new activity. Thank you for your contributions!
Is this still under consideration?
Yes, but it's not something we've been actively looking into. I still think an upstream change is the place to do this, but there hasn't been a lot of movement on that upstream issue I opened recently.
Any news? It seems this topic was forgotten, but such a label would still be useful.
It's available in flow mode via os.hostname; it's unlikely to be backported to static mode.