When discovered labels have conflicts, for example `__meta_kubernetes_pod_container_name`, we should drop them instead of preserving the last one.
Proposal
When k8s discovery is used and the scraper finds a pod whose containers have no containerPort field, it will create multiple targets that only differ in __meta_kubernetes_pod_container_name.
Then, when syncing those targets, they are considered duplicates. After de-duplication only the metadata of the last one is preserved, so the value of __meta_kubernetes_pod_container_name ends up being the last container in the pod (an init container, when one exists), which is wrong in many scenarios.
When no containerPort is present, we don't actually know the container name. So we should drop __meta_kubernetes_pod_container_name and leave it unspecified.
This conclusion may also apply to other similar discovered-label conflict scenarios.
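A rough sketch of what I mean (a hypothetical helper for illustration, not existing Prometheus code): when merging label sets that only differ in some keys, drop the conflicting keys instead of keeping whichever value happens to come last.

package main

import "fmt"

// dropConflicting merges discovered label sets for targets that collapse into
// a single scrape target. Any label whose value differs between the sets is
// dropped instead of silently keeping the value from the last set.
// Hypothetical helper, illustration only.
func dropConflicting(sets []map[string]string) map[string]string {
	merged := map[string]string{}
	conflict := map[string]bool{}
	for _, s := range sets {
		for k, v := range s {
			if old, ok := merged[k]; ok && old != v {
				conflict[k] = true
				continue
			}
			merged[k] = v
		}
	}
	for k := range conflict {
		delete(merged, k) // e.g. __meta_kubernetes_pod_container_name
	}
	return merged
}

func main() {
	out := dropConflicting([]map[string]string{
		{"__address__": "1.2.3.4:12345", "__meta_kubernetes_pod_container_name": "container1"},
		{"__address__": "1.2.3.4:12345", "__meta_kubernetes_pod_container_name": "container2"},
	})
	fmt.Println(out) // map[__address__:1.2.3.4:12345] — the conflicting label is gone
}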
I thought containerPort was a required field, if one wants to define a port.
After quickly looking over the code, I see the empty container.Ports case is handled in many places, assuming, as I mentioned, that containerPort is always present.
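For reference, here is how I read that empty container.Ports handling, as a simplified sketch (not the actual discovery/kubernetes/pod.go code): a container with no declared ports still yields one target, with the bare pod IP as __address__ and its name in the discovered labels.

package main

import "fmt"

// container is a minimal stand-in for the fields this sketch needs.
type container struct {
	Name  string
	Ports []int
	Init  bool
}

// podTargets is a simplified sketch, not the real pod.go code: every container
// yields at least one target, and containers without declared ports get the
// bare pod IP as __address__ while still carrying their name in the labels.
func podTargets(podIP string, containers []container) []map[string]string {
	var targets []map[string]string
	for _, c := range containers {
		if len(c.Ports) == 0 {
			targets = append(targets, map[string]string{
				"__address__":                          podIP,
				"__meta_kubernetes_pod_container_name": c.Name,
				"__meta_kubernetes_pod_container_init": fmt.Sprint(c.Init),
			})
			continue
		}
		for _, p := range c.Ports {
			targets = append(targets, map[string]string{
				"__address__":                          fmt.Sprintf("%s:%d", podIP, p),
				"__meta_kubernetes_pod_container_name": c.Name,
				"__meta_kubernetes_pod_container_init": fmt.Sprint(c.Init),
			})
		}
	}
	return targets
}

func main() {
	for _, t := range podTargets("1.2.3.4", []container{
		{Name: "container1"},       // no containerPort declared
		{Name: "container2"},       // no containerPort declared
		{Name: "init", Init: true}, // init container, no port
	}) {
		fmt.Println(t)
	}
}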
Are you able to reproduce this? We'll need more details: the spec of that Pod, the Prometheus config...
Hello from the bug-scrub!
@h0cheung this sounds like something we would want to fix, but we need more detail on what exactly is going wrong. Would you be able to provide this detail?
For example, there are some pods, each with several containers that have no containerPort defined in the spec. They expose a port 12345 that should be scraped.
Then we may use a config like:
kubernetes_sd_configs:
  - role: pod
    selectors:
      - field: # a selector for these pods
# common configurations...
relabel_configs:
  - source_labels: [__address__]
    target_label: __address__
    regex: ([^:]+)(?::\d+)?
    replacement: $$1:12345
    action: replace
to specify the port. This way, there may be some targets like these (assuming the pod IP is 1.2.3.4):
{__address__="1.2.3.4:12345", __meta_kubernetes_pod_name="pod1", __meta_kubernetes_pod_container_name="container1", ...}
{__address__="1.2.3.4:12345", __meta_kubernetes_pod_name="pod1", __meta_kubernetes_pod_container_name="container2", ...}
{__address__="1.2.3.4:12345", __meta_kubernetes_pod_name="pod1", __meta_kubernetes_pod_container_name="container3", ...}
Since they share the same address, they will be de-duplicated and only the last one will be preserved, so the final output will be {__address__="1.2.3.4:12345", __meta_kubernetes_pod_name="pod1", __meta_kubernetes_pod_container_name="container3", ...}.
Maybe we should drop the __meta_kubernetes_pod_container_name when de-duplicating, because it has multiple values.
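To make the de-duplication effect concrete, here is a standalone sketch (an assumed illustration, not the actual scrape-pool code): if targets are keyed by their final label set, ignoring __meta_* labels, all three collapse into one entry and the last container's metadata silently wins.

package main

import (
	"fmt"
	"sort"
	"strings"
)

// key builds a deterministic key from the target's final labels; __meta_*
// labels are excluded, mirroring the fact that discovered labels are not
// part of the scraped target's identity.
func key(lbls map[string]string) string {
	var parts []string
	for k, v := range lbls {
		if strings.HasPrefix(k, "__meta_") {
			continue
		}
		parts = append(parts, k+"="+v)
	}
	sort.Strings(parts)
	return strings.Join(parts, ",")
}

func main() {
	targets := []map[string]string{
		{"__address__": "1.2.3.4:12345", "__meta_kubernetes_pod_container_name": "container1"},
		{"__address__": "1.2.3.4:12345", "__meta_kubernetes_pod_container_name": "container2"},
		{"__address__": "1.2.3.4:12345", "__meta_kubernetes_pod_container_name": "container3"},
	}
	unique := map[string]map[string]string{}
	for _, t := range targets {
		unique[key(t)] = t // later targets with the same key overwrite earlier ones
	}
	for _, t := range unique {
		fmt.Println(t) // only the container3 variant survives
	}
}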
Thanks for the details.
Could you elaborate on why you cannot/don't want to define ports.containerPort? Although it's just informational, we rely on it (its concreteness) for discovery.
Also, I don't see how you're able to have multiple containers listening on the same port in the same Pod; even if it's possible, it shouldn't be that common of a use case.
The init container does not need ports.containerPort, but it is chosen as the container name.
It is very difficult to figure this out without full details.
Here is a full example:
minikube start
helm --kube-context minikube install prometheus oci://ghcr.io/prometheus-community/charts/prometheus
kubectl --context minikube apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
  labels:
    app.kubernetes.io/name: MyApp
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/metrics"
spec:
  containers:
    - name: myapp-container
      image: quay.io/brancz/prometheus-example-app:v0.5.0
      ports:
        - containerPort: 8080
  initContainers:
    - name: init-myservice
      image: busybox:1.28
      command: ['sh', '-c', "echo done"]
EOF
kubectl --context minikube --namespace default port-forward $(kubectl --context minikube get pods --namespace default -l "app.kubernetes.io/name=prometheus,app.kubernetes.io/instance=prometheus" -o name) 9090:9090
As you can see, I have a container that serves the metrics and declares a container port.
Here are the details of the running containers:
kubectl --context minikube get pod myapp-pod -o yaml | grep -e containerStatuses: -e initContainerStatuses: -e containerID: -e image: -e imageID: -e name: | grep -A4 Statuses:
containerStatuses:
- containerID: docker://1d45b87d72bf0ca9ee5a32714bd07e1ac4c3726159220368897b3519fef05026
image: quay.io/brancz/prometheus-example-app:v0.5.0
imageID: docker-pullable://quay.io/brancz/prometheus-example-app@sha256:10025acb391cbbc23e0db3d041df02edae53f7c1723fdf485e69d43d3ce2cef9
name: myapp-container
--
initContainerStatuses:
- containerID: docker://b50bbc68db496b62799e829293532c5f61adcf6d53434f80cef02d696136bc57
image: busybox:1.28
imageID: docker-pullable://busybox@sha256:141c253bc4c3fd0a201d32dc1f493bcf3fff003b6df416dea4f41046e0f37d47
name: init-myservice
But here are the discovery labels:
curl -s localhost:9090/api/v1/targets | jq '.data.activeTargets[]|select(.labels.pod=="myapp-pod").discoveredLabels' | grep __meta_kubernetes_pod_container
"__meta_kubernetes_pod_container_id": "docker://b50bbc68db496b62799e829293532c5f61adcf6d53434f80cef02d696136bc57",
"__meta_kubernetes_pod_container_image": "busybox:1.28",
"__meta_kubernetes_pod_container_init": "true",
"__meta_kubernetes_pod_container_name": "init-myservice",
The labels are not correct.
I honestly still don’t see where the problem is.
The TestPodDiscoveryInitContainer test in https://github.com/prometheus/prometheus/blob/1b0f2b3017b0231afdbbf752971b363c5f91d135/discovery/kubernetes/pod_test.go#L306 shows that in this case both targets/containers are discovered, and the test does pass.
You could try tweaking that test to see if it reproduces your issue.
Hello,
I think the problem is likely elsewhere. Why does my example show the init container? Where do you select the init container as the label source instead of myapp-container, which has a valid container port?
Some additional context: myapp-container is not found in the targets. Only the init container remains.
- containers:
kubectl --context minikube get pod myapp-pod -o yaml | grep -e containerStatuses: -e initContainerStatuses: -e containerID: -e image: -e name: | grep -A3 Statuses:
containerStatuses:
- containerID: docker://b52dc3295c57d783205d6111d5447703d3149b6e26d0dbd2c47c47a79b19d50f
image: quay.io/brancz/prometheus-example-app:v0.5.0
name: myapp-container
--
initContainerStatuses:
- containerID: docker://bd60098b7550881734c8b440d2dfd1e43baec5831f5962d7c32d57302f546a95
image: busybox:1.28
name: init-myservice
- active targets
curl -s localhost:9090/api/v1/targets | jq '.data.activeTargets[]|select(.scrapePool=="kubernetes-pods")|select(.labels.pod=="myapp-pod")' | grep __meta_kubernetes_pod_container
"__meta_kubernetes_pod_container_id": "docker://bd60098b7550881734c8b440d2dfd1e43baec5831f5962d7c32d57302f546a95",
"__meta_kubernetes_pod_container_image": "busybox:1.28",
"__meta_kubernetes_pod_container_init": "true",
"__meta_kubernetes_pod_container_name": "init-myservice",
- the dropped-targets list for this pod is empty
curl -s localhost:9090/api/v1/targets | jq '.data.droppedTargets[]|select(.scrapePool=="kubernetes-pods")|select(.labels.pod=="myapp-pod")'
The scrape config for this job (this is the default config of the helm chart)
kubectl --context minikube get configmaps prometheus-server -o yaml | yq '.data."prometheus.yml"' | yq '.scrape_configs[]|select(.job_name=="kubernetes-pods")'
honor_labels: true
job_name: kubernetes-pods
kubernetes_sd_configs:
  - role: pod
relabel_configs:
  - action: keep
    regex: true
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_scrape
  - action: drop
    regex: true
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_scrape_slow
  - action: replace
    regex: (https?)
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_scheme
    target_label: __scheme__
  - action: replace
    regex: (.+)
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_path
    target_label: __metrics_path__
  - action: replace
    regex: (\d+);(([A-Fa-f0-9]{1,4}::?){1,7}[A-Fa-f0-9]{1,4})
    replacement: '[$2]:$1'
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_port
      - __meta_kubernetes_pod_ip
    target_label: __address__
  - action: replace
    regex: (\d+);((([0-9]+?)(\.|$)){4})
    replacement: $2:$1
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_port
      - __meta_kubernetes_pod_ip
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_pod_annotation_prometheus_io_param_(.+)
    replacement: __param_$1
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - action: replace
    source_labels:
      - __meta_kubernetes_namespace
    target_label: namespace
  - action: replace
    source_labels:
      - __meta_kubernetes_pod_name
    target_label: pod
  - action: drop
    regex: Pending|Succeeded|Failed|Completed
    source_labels:
      - __meta_kubernetes_pod_phase
  - action: replace
    source_labels:
      - __meta_kubernetes_pod_node_name
    target_label: node
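For what it's worth, the IPv4 port/IP replace rule above rewrites __address__ from the prometheus.io/port annotation and the pod IP, regardless of any declared containerPort, so every container target of the pod ends up with the same address. Here is a standalone simulation of just that rule with plain Go regexp (the pod IP 10.244.0.5 and port 8080 are example values, and this is not the Prometheus relabel package):

package main

import (
	"fmt"
	"regexp"
)

func main() {
	// Mirrors the IPv4 rule from the config above:
	//   source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_port, __meta_kubernetes_pod_ip]
	//   regex: (\d+);((([0-9]+?)(\.|$)){4})
	//   replacement: $2:$1
	re := regexp.MustCompile(`(\d+);((([0-9]+?)(\.|$)){4})`)

	// Both the app container and the init container come from the same pod,
	// so the concatenated source labels are identical for every target.
	for _, container := range []string{"myapp-container", "init-myservice"} {
		src := "8080" + ";" + "10.244.0.5" // annotation port ; pod IP (example values)
		addr := re.ReplaceAllString(src, "${2}:${1}")
		fmt.Printf("%-16s -> __address__=%q\n", container, addr)
	}
	// Both print __address__="10.244.0.5:8080", so after relabeling the two
	// targets are identical except for their __meta_* discovered labels.
}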
And another strange behaviour: for another job (kubernetes-pods-slow), we find the container as a dropped target.
Why did myapp-container disappear from the scrape pool kubernetes-pods?
curl -s localhost:9090/api/v1/targets | jq '.data.droppedTargets[]|select(.discoveredLabels."__meta_kubernetes_pod_label_app_kubernetes_io_name"=="MyApp")' | grep -e __meta_kubernetes_pod_container -e scrapePool
"__meta_kubernetes_pod_container_id": "docker://b52dc3295c57d783205d6111d5447703d3149b6e26d0dbd2c47c47a79b19d50f",
"__meta_kubernetes_pod_container_image": "quay.io/brancz/prometheus-example-app:v0.5.0",
"__meta_kubernetes_pod_container_init": "false",
"__meta_kubernetes_pod_container_name": "myapp-container",
"__meta_kubernetes_pod_container_port_number": "8080",
"__meta_kubernetes_pod_container_port_protocol": "TCP",
"scrapePool": "kubernetes-pods-slow"
"__meta_kubernetes_pod_container_id": "docker://bd60098b7550881734c8b440d2dfd1e43baec5831f5962d7c32d57302f546a95",
"__meta_kubernetes_pod_container_image": "busybox:1.28",
"__meta_kubernetes_pod_container_init": "true",
"__meta_kubernetes_pod_container_name": "init-myservice",
"scrapePool": "kubernetes-pods-slow"
still no myapp-container in active targets
curl -s localhost:9090/api/v1/targets | jq '.data.activeTargets[]|select(.discoveredLabels."__meta_kubernetes_pod_label_app_kubernetes_io_name"=="MyApp")' | grep -e __meta_kubernetes_pod_container -e scrapePool
"__meta_kubernetes_pod_container_id": "docker://bd60098b7550881734c8b440d2dfd1e43baec5831f5962d7c32d57302f546a95",
"__meta_kubernetes_pod_container_image": "busybox:1.28",
"__meta_kubernetes_pod_container_init": "true",
"__meta_kubernetes_pod_container_name": "init-myservice",
"scrapePool": "kubernetes-pods",
scrape config for kubernetes-pods-slow
kubectl --context minikube get configmaps prometheus-server -o yaml | yq '.data."prometheus.yml"' | yq '.scrape_configs[]|select(.job_name=="kubernetes-pods-slow")'
honor_labels: true
job_name: kubernetes-pods-slow
kubernetes_sd_configs:
  - role: pod
relabel_configs:
  - action: keep
    regex: true
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_scrape_slow
  - action: replace
    regex: (https?)
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_scheme
    target_label: __scheme__
  - action: replace
    regex: (.+)
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_path
    target_label: __metrics_path__
  - action: replace
    regex: (\d+);(([A-Fa-f0-9]{1,4}::?){1,7}[A-Fa-f0-9]{1,4})
    replacement: '[$2]:$1'
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_port
      - __meta_kubernetes_pod_ip
    target_label: __address__
  - action: replace
    regex: (\d+);((([0-9]+?)(\.|$)){4})
    replacement: $2:$1
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_port
      - __meta_kubernetes_pod_ip
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_pod_annotation_prometheus_io_param_(.+)
    replacement: __param_$1
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - action: replace
    source_labels:
      - __meta_kubernetes_namespace
    target_label: namespace
  - action: replace
    source_labels:
      - __meta_kubernetes_pod_name
    target_label: pod
  - action: drop
    regex: Pending|Succeeded|Failed|Completed
    source_labels:
      - __meta_kubernetes_pod_phase
  - action: replace
    source_labels:
      - __meta_kubernetes_pod_node_name
    target_label: node
scrape_interval: 5m
scrape_timeout: 30s