prometheus
docker_sd: Deduplicate containers attached to multiple networks
Proposal
I have several containers attached to multiple Docker networks (e.g. frontend, backend). At the moment docker_sd creates a scraping target for each container on all networks. As some services are only attached to one network, it's not feasible to use relabelling to drop all containers in a certain network (or only keep those).
It would be nice if docker_sd could be configured to use only the first network, for example, or offered some other way to avoid duplicate targets.
This is interesting. Can you work around the issue by taking multiple labels in source_labels and using regexes? E.g. can you find which container is attached to which network based on the container name?
Are you using docker swarm?
Can you work around the issue by taking multiple labels in source_labels and using regexes? E.g. can you find which container is attached to which network based on the container name?
The container network name is stored in __meta_docker_network_name, so I tried setting a label prometheus.network on the container, but AFAIK there's no way to compare __meta_docker_network_name and the resulting __meta_docker_container_label_prometheus_network during relabeling, as the RE2 engine doesn't support back references.
Are you using docker swarm?
No, I'm using docker_sd, but I guess this also applies to docker_swarm_sd.
Indeed, you would need to hardcode them:
- source_labels: [__meta_docker_network_name, __meta_docker_container_label_prometheus_network]
  regex: frontend;frontend
  target_label: __tmp_docker_keep
  replacement: keep
- source_labels: [__meta_docker_network_name, __meta_docker_container_label_prometheus_network]
  regex: backend;backend
  target_label: __tmp_docker_keep
  replacement: keep
- source_labels: [__tmp_docker_keep]
  regex: keep
  action: keep
or, shorter
- source_labels: [__meta_docker_network_name, __meta_docker_container_label_prometheus_network]
  regex: frontend;frontend|backend;backend
  action: keep
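To make the mechanics of the shorter rule concrete, here is a minimal Go sketch of how a keep rule evaluates it: the source label values are joined with the default separator `;` and the regex has to match the whole joined string. The function name `keep` and the sample label values are made up for illustration; this is not the Prometheus implementation.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// keep reports whether a target would survive a `keep` rule: the values of the
// source labels are joined with ";" (the default separator) and the regex must
// match the entire joined string.
func keep(labels map[string]string, sourceLabels []string, pattern string) bool {
	vals := make([]string, 0, len(sourceLabels))
	for _, name := range sourceLabels {
		vals = append(vals, labels[name]) // a missing label contributes ""
	}
	// Relabel regexes are anchored at both ends; emulate that here.
	re := regexp.MustCompile("^(?:" + pattern + ")$")
	return re.MatchString(strings.Join(vals, ";"))
}

func main() {
	src := []string{"__meta_docker_network_name", "__meta_docker_container_label_prometheus_network"}
	pattern := "frontend;frontend|backend;backend"

	// Same container discovered once per attached network (illustrative values).
	onFrontend := map[string]string{
		"__meta_docker_network_name":                       "frontend",
		"__meta_docker_container_label_prometheus_network": "backend",
	}
	onBackend := map[string]string{
		"__meta_docker_network_name":                       "backend",
		"__meta_docker_container_label_prometheus_network": "backend",
	}

	fmt.Println(keep(onFrontend, src, pattern)) // false: "frontend;backend" is not listed
	fmt.Println(keep(onBackend, src, pattern))  // true: "backend;backend" is listed
}
```

Every network name has to appear in the pattern explicitly, which is the hardcoding mentioned above.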
The issue with the approach you are proposing is that I do not know what guarantees Docker gives that the networks are returned in a fixed order between API calls, or how it behaves when you attach or detach networks, restart the daemon, etc.
I'm not sure about the behavior of the Docker API either, it would just be nice to have some built-in way to avoid duplicate targets.
Okay. I think we could have a networks array where we take a list of networks. For each container, we add its target on the matching network. If no network is matched, we dismiss the target. Does that sound like a solution?
Ok, I'm not sure I understand you correctly. Given the following container setup (and assuming all containers have Prometheus metric endpoints), how can we make sure the "api" container doesn't end up twice in the target list?
- container: frontend
  networks: frontend
- container: api
  networks: frontend,backend
- container: database
  networks: backend
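A small Go sketch of the concern, assuming the proposed networks list acts as a plain allow-list and one target is emitted per attached network that appears in it. The function `targetsPerContainer` is hypothetical and only counts targets for the setup above.

```go
package main

import "fmt"

// containers mirrors the example setup above: container name -> attached networks.
var containers = map[string][]string{
	"frontend": {"frontend"},
	"api":      {"frontend", "backend"},
	"database": {"backend"},
}

// targetsPerContainer counts how many targets each container would produce if
// one target were emitted per attached network that is in the allow-list.
func targetsPerContainer(allowed ...string) map[string]int {
	ok := make(map[string]bool, len(allowed))
	for _, n := range allowed {
		ok[n] = true
	}
	counts := make(map[string]int, len(containers))
	for name, nets := range containers {
		counts[name] = 0
		for _, n := range nets {
			if ok[n] {
				counts[name]++
			}
		}
	}
	return counts
}

func main() {
	// Allowing both networks keeps every container but duplicates "api".
	fmt.Println(targetsPerContainer("frontend", "backend")) // map[api:2 database:1 frontend:1]
	// Allowing only one network removes the duplicate but loses "database".
	fmt.Println(targetsPerContainer("frontend")) // map[api:1 database:0 frontend:1]
}
```

Under these assumptions, an allow-list on its own cannot yield exactly one target for each of the three containers.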
We could explore an alternative mode, like we have for Kubernetes, where one target carries the addresses on all of its networks.
I have tagged this issue as P3.
Currently, for each container we retrieve we loop through each of its networks and furthermore each of its ports, adding a target for each network+port combination.
https://github.com/prometheus/prometheus/blob/9e88c3bb4dd7b01456639df06df07053d5626445/discovery/moby/docker.go#L186-L205
I think we could just provide a configuration setting to break after the first network iteration, allowing target(s) to be added for only one network per container (there could be multiple ports).
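Roughly, the idea looks like the following Go sketch. The `container` and `network` types, the `firstNetworkOnly` parameter, and the addresses are hypothetical stand-ins, not the types or options used in discovery/moby/docker.go; networks are modelled as a slice here so that "first" is well defined.

```go
package main

import "fmt"

// network and container are simplified, hypothetical stand-ins for the data
// returned by the Docker API.
type network struct {
	Name string
	IP   string
}

type container struct {
	Name     string
	Networks []network
	Ports    []int
}

// targets returns one address per network+port combination. With
// firstNetworkOnly set, it stops after the first network, so a container
// yields one target per port instead of one per network and port.
func targets(c container, firstNetworkOnly bool) []string {
	var out []string
	for _, n := range c.Networks {
		for _, p := range c.Ports {
			out = append(out, fmt.Sprintf("%s:%d", n.IP, p))
		}
		if firstNetworkOnly {
			break
		}
	}
	return out
}

func main() {
	api := container{
		Name:     "api",
		Networks: []network{{"frontend", "10.0.1.5"}, {"backend", "10.0.2.5"}},
		Ports:    []int{8080, 9100},
	}
	fmt.Println(targets(api, false)) // [10.0.1.5:8080 10.0.1.5:9100 10.0.2.5:8080 10.0.2.5:9100]
	fmt.Println(targets(api, true))  // [10.0.1.5:8080 10.0.1.5:9100]
}
```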
If we break, we should order them. We should also have one label for each network address (without port).
If we break, we should order them.
Using what method?
We should also have one label for each network address (without port).
So when we break and only add a target for the first network, we should add extra labels to that target with the other network addresses (like __meta_docker_network_other_frontend=127.0.0.1)
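One possible reading of this, as a Go sketch: sort the networks by name (an assumption, since the thread does not settle on an ordering method), use the first one for `__address__`, and expose the remaining addresses as `__meta_docker_network_other_<name>` labels. The function `buildTarget` and the addresses are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// buildTarget picks one network deterministically and records the others as
// labels. networks maps network name -> container IP on that network.
// Sorting by name is an assumption made for this sketch.
func buildTarget(networks map[string]string, port int) map[string]string {
	names := make([]string, 0, len(networks))
	for name := range networks {
		names = append(names, name)
	}
	sort.Strings(names) // Go map iteration order is random, so order explicitly

	labels := map[string]string{}
	for i, name := range names {
		if i == 0 {
			// The first network in sorted order provides the scrape address.
			labels["__address__"] = fmt.Sprintf("%s:%d", networks[name], port)
			labels["__meta_docker_network_name"] = name
			continue
		}
		// Remaining networks are exposed without a port. A real implementation
		// would also need to sanitize the network name for use in a label name.
		labels["__meta_docker_network_other_"+name] = networks[name]
	}
	return labels
}

func main() {
	labels := buildTarget(map[string]string{"frontend": "10.0.1.5", "backend": "10.0.2.5"}, 8080)
	fmt.Println(labels["__address__"])                         // 10.0.2.5:8080
	fmt.Println(labels["__meta_docker_network_other_frontend"]) // 10.0.1.5
}
```

With alphabetical ordering, "backend" wins over "frontend" in this example, which may or may not be the network a user wants; that is part of why the ordering method matters.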
I think Docker, Docker Swarm and Kubernetes SD should ideally behave the same way, as it boils down to "software running containers attached to (multiple) networks". I'm not saying that the current Kubernetes implementation is the way to go, but if we come to a decision, it should apply to all three SD methods.
The Kubernetes implementation is the way to go, but it should be behind a flag if possible.
If we break, we should order them.
Using what method?
We should also have one label for each network address (without port).
So when we break and only add a target for the first network, we should add extra labels to that target with the other network addresses (like __meta_docker_network_other_frontend=127.0.0.1)
I think there is no need to order multiple networks by default; just taking the first network would be fine.
People can order the networks in their docker-compose config file.
Kubernetes has the same problem when a pod has multiple networks, and it just shows the first network as the default network.
The command looks like this:
nsenter -n -t19032 -F -- ip -o -4 addr show dev eth0 scope global
For anyone curious, I solved the network deduplication using the keepequal action in the relabel configs. I also use a couple of extra relabel configs to deduplicate the port (if you have a container that publishes multiple ports), and I can override the port when the default 80 is not the port I want. With these relabel configs I am able to monitor 13 containers, all of which use a different setup (some with http, some https, some using a custom port, some using the discovered port, some using a hostname, some using an IP), with just a few simple labels on each Docker Swarm service. Double-checking the Prometheus UI confirms that it picks up only one target per service, on the port and network I expect.
Relabel configs with comments:
relabel_configs:
  # Only keep containers that should be running.
  - source_labels: [__meta_dockerswarm_task_desired_state]
    regex: running
    action: keep
  # Only keep containers that have a `prometheus-port` label.
  - source_labels: [__meta_dockerswarm_service_label_prometheus_port]
    regex: .+
    action: keep
  # get the port from this target
  - source_labels: [__address__]
    regex: '[^\s]+:(\d+)'
    target_label: __tmp_found_port
  # keep only if this is the port we're looking for
  - source_labels: [__meta_dockerswarm_service_label_prometheus_port]
    target_label: __tmp_found_port
    action: keepequal
  # limit to one network
  - source_labels: [__meta_dockerswarm_service_label_prometheus_network]
    target_label: __meta_dockerswarm_network_name
    action: keepequal
  # replace address with prometheus-replacement-port if available
  - source_labels: [__address__, __meta_dockerswarm_service_label_prometheus_replace_port]
    separator: ':'
    regex: '([^:]+):[^:]+:(\d+)'
    replacement: "${1}:${2}"
    target_label: __address__
  # override address if an override is provided
  - source_labels: [__meta_dockerswarm_service_label_prometheus_address]
    regex: '([^\s]+)'
    target_label: __address__
  # override scheme if an override is provided
  - source_labels: [__meta_dockerswarm_service_label_prometheus_scheme]
    regex: '([^\s]+)'
    target_label: __scheme__
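For readers unfamiliar with keepequal, here is a minimal Go sketch of what the "limit to one network" rule does: a target is kept only if the joined source label values equal the value of the target label. The function `keepEqual` and the sample addresses are illustrative and not taken from the Prometheus code base.

```go
package main

import (
	"fmt"
	"strings"
)

// keepEqual mimics the `keepequal` relabel action: the target is kept only if
// the source label values, joined by the separator, equal the value of
// targetLabel.
func keepEqual(labels map[string]string, sourceLabels []string, separator, targetLabel string) bool {
	vals := make([]string, 0, len(sourceLabels))
	for _, name := range sourceLabels {
		vals = append(vals, labels[name])
	}
	return strings.Join(vals, separator) == labels[targetLabel]
}

func main() {
	// One swarm task attached to two networks shows up as two candidate
	// targets; the service label names the network that should be scraped.
	candidates := []map[string]string{
		{
			"__address__":                     "10.0.1.5:8080",
			"__meta_dockerswarm_network_name": "frontend",
			"__meta_dockerswarm_service_label_prometheus_network": "frontend",
		},
		{
			"__address__":                     "10.0.2.5:8080",
			"__meta_dockerswarm_network_name": "backend",
			"__meta_dockerswarm_service_label_prometheus_network": "frontend",
		},
	}
	src := []string{"__meta_dockerswarm_service_label_prometheus_network"}
	for _, c := range candidates {
		kept := keepEqual(c, src, ";", "__meta_dockerswarm_network_name")
		fmt.Println(c["__address__"], kept)
	}
	// Prints:
	// 10.0.1.5:8080 true
	// 10.0.2.5:8080 false
}
```

Unlike the regex-based keep rule, this comparison needs no hardcoded network names, which is what makes it work as a per-service deduplication switch.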
Hello from the bug scrub: please see #10490 for the latest status.