
When discovered labels have conflicts, for example `__meta_kubernetes_pod_container_name`, we should drop them instead of preserving the last one

h0cheung opened this issue 1 year ago · 1 comment

Proposal

When k8s discovery is used and the scraper finds a pod without a containerPort field, it will create multiple targets that differ only in __meta_kubernetes_pod_container_name.

Then, when syncing those targets, they are considered duplicates. After de-duplication, only the metadata of the last one is preserved, so the value of __meta_kubernetes_pod_container_name will be that of the last container in the pod (an init container, when one exists), which is wrong in many scenarios.

When no containerPort is present, we don't actually know the container name, so we should drop __meta_kubernetes_pod_container_name and leave it unspecified.

This conclusion may also apply to other scenarios where discovered labels conflict.
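To illustrate the proposal, here is a minimal sketch (plain Python, not Prometheus internals; the merge function and label sets are hypothetical) of de-duplicating targets by address while dropping any label whose value conflicts, instead of keeping the last value seen:

```python
# Illustrative sketch (not Prometheus code): merge duplicate targets that
# share the same __address__, dropping meta labels whose values conflict
# instead of keeping the value from the last target.

def merge_targets(targets):
    merged = {}
    for labels in targets:
        addr = labels["__address__"]
        if addr not in merged:
            merged[addr] = dict(labels)
        else:
            kept = merged[addr]
            for name in list(kept):
                if labels.get(name) != kept[name]:
                    del kept[name]  # conflicting value: drop the label

    return list(merged.values())

targets = [
    {"__address__": "1.2.3.4:12345", "__meta_kubernetes_pod_name": "pod1",
     "__meta_kubernetes_pod_container_name": "container1"},
    {"__address__": "1.2.3.4:12345", "__meta_kubernetes_pod_name": "pod1",
     "__meta_kubernetes_pod_container_name": "container2"},
]

result = merge_targets(targets)
# The ambiguous container-name label is dropped; unambiguous labels survive.
print(result)
```

With last-wins behavior the merged target would claim `container2`; with drop-on-conflict the container name is simply absent, which is the "unspecified" outcome proposed above.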

h0cheung avatar Oct 16 '24 07:10 h0cheung

I thought containerPort was a required field if one wants to define a port. After quickly looking over the code, I see the empty container.Ports case is handled in many places, assuming, as I mentioned, that containerPort is always present.

Are you able to reproduce this? We'll need more details: the spec of that Pod, the prom config...

machine424 avatar Oct 17 '24 16:10 machine424

Hello from the bug-scrub!

@h0cheung this sounds like something we would want to fix, but we need more detail on what exactly is going wrong. Would you be able to provide this detail?

bboreham avatar Jul 15 '25 11:07 bboreham

> Hello from the bug-scrub!
>
> @h0cheung this sounds like something we would want to fix, but we need more detail on what exactly is going wrong. Would you be able to provide this detail?

For example, there are some pods, each with several containers and no containerPort defined in the spec. They have a port 12345 that should be scraped.

Then we may use a config like:

kubernetes_sd_configs:
  - role: pod
    selectors:
      field: # a selector for these pods
    # common configurations...
    relabel_configs:
      - source_labels: [__address__]
        target_label: __address__
        regex: ([^:]+)(?::\d+)?
        replacement: $1:12345
        action: replace

to specify the port. This way, there may be some targets like this (assuming the pod IP is 1.2.3.4):

{__address__="1.2.3.4:12345", __meta_kubernetes_pod_name="pod1", __meta_kubernetes_pod_container_name="container1", ...}
{__address__="1.2.3.4:12345", __meta_kubernetes_pod_name="pod1", __meta_kubernetes_pod_container_name="container2", ...}
{__address__="1.2.3.4:12345", __meta_kubernetes_pod_name="pod1", __meta_kubernetes_pod_container_name="container3", ...}

As they have the same address, they will be de-duplicated and only the last one preserved, so the final output will be {__address__="1.2.3.4:12345", __meta_kubernetes_pod_name="pod1", __meta_kubernetes_pod_container_name="container3", ...}
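The three targets collapse because the relabel rule rewrites every address to the same host:port before de-duplication. The regex and replacement can be checked in isolation; this sketch uses Python's re module, with `\1` standing in for Prometheus's `$1`, and explicit `^…$` anchors since Prometheus anchors relabel regexes implicitly:

```python
import re

# The relabel rule's regex: capture the host, optionally consuming any
# existing :port suffix. Prometheus anchors the regex implicitly; here
# the anchors are written out.
pattern = re.compile(r"^([^:]+)(?::\d+)?$")

for address in ["1.2.3.4", "1.2.3.4:9999"]:
    rewritten = pattern.sub(r"\1:12345", address)
    print(address, "->", rewritten)  # both rewrite to 1.2.3.4:12345
```

Whether a port was present or not, every container's target ends up with the identical `__address__`, which is exactly the duplicate-key situation described above.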

Maybe we should drop the __meta_kubernetes_pod_container_name when de-duplicating, because it has multiple values.

h0cheung avatar Jul 16 '25 07:07 h0cheung

Thanks for the details. Could you elaborate on why you cannot/don't want to define ports.containerPort? Although it's just informational, we rely on it (its concreteness) for discovery. Also, I don't see how you're able to have multiple containers listen on the same port in the same Pod; even if it's possible, it shouldn't be that common of a use case.

machine424 avatar Jul 16 '25 09:07 machine424

The init container does not need ports.containerPort, but it is chosen as the container name.

saez0pub avatar Oct 01 '25 13:10 saez0pub

It is very difficult to figure this out without full details.

bboreham avatar Oct 01 '25 13:10 bboreham

Here is a full example:

minikube start
helm --kube-context minikube install prometheus oci://ghcr.io/prometheus-community/charts/prometheus

kubectl --context minikube apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
  labels:
    app.kubernetes.io/name: MyApp
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/metrics"
spec:
  containers:
    - name: myapp-container
      image: quay.io/brancz/prometheus-example-app:v0.5.0
      ports:
        - containerPort: 8080
  initContainers:
    - name: init-myservice
      image: busybox:1.28
      command: ['sh', '-c', "echo done"]
EOF

kubectl --context minikube --namespace default port-forward $(kubectl --context minikube get pods --namespace default -l "app.kubernetes.io/name=prometheus,app.kubernetes.io/instance=prometheus" -o name) 9090:9090

As you can see, I have a container which will serve the metrics with a container port.

Here are the details of the running containers:

kubectl --context minikube get pod myapp-pod -o yaml | grep -e containerStatuses: -e initContainerStatuses: -e containerID: -e image: -e imageID: -e name: | grep -A4 Statuses:
  containerStatuses:
  - containerID: docker://1d45b87d72bf0ca9ee5a32714bd07e1ac4c3726159220368897b3519fef05026
    image: quay.io/brancz/prometheus-example-app:v0.5.0
    imageID: docker-pullable://quay.io/brancz/prometheus-example-app@sha256:10025acb391cbbc23e0db3d041df02edae53f7c1723fdf485e69d43d3ce2cef9
    name: myapp-container
--
  initContainerStatuses:
  - containerID: docker://b50bbc68db496b62799e829293532c5f61adcf6d53434f80cef02d696136bc57
    image: busybox:1.28
    imageID: docker-pullable://busybox@sha256:141c253bc4c3fd0a201d32dc1f493bcf3fff003b6df416dea4f41046e0f37d47
    name: init-myservice

But here are the discovery labels:

 curl -s localhost:9090/api/v1/targets | jq '.data.activeTargets[]|select(.labels.pod=="myapp-pod").discoveredLabels' | grep __meta_kubernetes_pod_container 
  "__meta_kubernetes_pod_container_id": "docker://b50bbc68db496b62799e829293532c5f61adcf6d53434f80cef02d696136bc57",
  "__meta_kubernetes_pod_container_image": "busybox:1.28",
  "__meta_kubernetes_pod_container_init": "true",
  "__meta_kubernetes_pod_container_name": "init-myservice",

The labels are not correct.

saez0pub avatar Oct 01 '25 15:10 saez0pub

I honestly still don’t see where the problem is.

The TestPodDiscoveryInitContainer test in https://github.com/prometheus/prometheus/blob/1b0f2b3017b0231afdbbf752971b363c5f91d135/discovery/kubernetes/pod_test.go#L306 shows that both targets/containers are discovered in this case, and the test does pass.

You could try tweaking that test to see if it reproduces your issue.

machine424 avatar Oct 01 '25 15:10 machine424

Hello,

I think the problem is likely elsewhere. Why does my example show the init container? Where do you select the init container as the label source instead of myapp-container, which has a valid container port?

saez0pub avatar Oct 01 '25 16:10 saez0pub

Some additional context: myapp-container is not found in the targets. Only the init container remains.

  • containers:
 kubectl --context minikube get pod myapp-pod -o yaml | grep -e containerStatuses: -e initContainerStatuses: -e containerID: -e image: -e name: | grep -A3 Statuses:       

  containerStatuses:
  - containerID: docker://b52dc3295c57d783205d6111d5447703d3149b6e26d0dbd2c47c47a79b19d50f
    image: quay.io/brancz/prometheus-example-app:v0.5.0
    name: myapp-container
--
  initContainerStatuses:
  - containerID: docker://bd60098b7550881734c8b440d2dfd1e43baec5831f5962d7c32d57302f546a95
    image: busybox:1.28
    name: init-myservice
  • active targets
curl -s localhost:9090/api/v1/targets | jq '.data.activeTargets[]|select(.scrapePool=="kubernetes-pods")|select(.labels.pod=="myapp-pod")' | grep __meta_kubernetes_pod_container
    "__meta_kubernetes_pod_container_id": "docker://bd60098b7550881734c8b440d2dfd1e43baec5831f5962d7c32d57302f546a95",
    "__meta_kubernetes_pod_container_image": "busybox:1.28",
    "__meta_kubernetes_pod_container_init": "true",
    "__meta_kubernetes_pod_container_name": "init-myservice",
  • dropped target for this pod is empty
curl -s localhost:9090/api/v1/targets | jq '.data.droppedTargets[]|select(.scrapePool=="kubernetes-pods")|select(.labels.pod=="myapp-pod")'


The scrape config for this job (the default config of the Helm chart):
 kubectl --context minikube get configmaps prometheus-server -o yaml | yq '.data."prometheus.yml"' | yq '.scrape_configs[]|select(.job_name=="kubernetes-pods")'
honor_labels: true
job_name: kubernetes-pods
kubernetes_sd_configs:
  - role: pod
relabel_configs:
  - action: keep
    regex: true
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_scrape
  - action: drop
    regex: true
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_scrape_slow
  - action: replace
    regex: (https?)
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_scheme
    target_label: __scheme__
  - action: replace
    regex: (.+)
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_path
    target_label: __metrics_path__
  - action: replace
    regex: (\d+);(([A-Fa-f0-9]{1,4}::?){1,7}[A-Fa-f0-9]{1,4})
    replacement: '[$2]:$1'
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_port
      - __meta_kubernetes_pod_ip
    target_label: __address__
  - action: replace
    regex: (\d+);((([0-9]+?)(\.|$)){4})
    replacement: $2:$1
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_port
      - __meta_kubernetes_pod_ip
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_pod_annotation_prometheus_io_param_(.+)
    replacement: __param_$1
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - action: replace
    source_labels:
      - __meta_kubernetes_namespace
    target_label: namespace
  - action: replace
    source_labels:
      - __meta_kubernetes_pod_name
    target_label: pod
  - action: drop
    regex: Pending|Succeeded|Failed|Completed
    source_labels:
      - __meta_kubernetes_pod_phase
  - action: replace
    source_labels:
      - __meta_kubernetes_pod_node_name
    target_label: node

saez0pub avatar Oct 02 '25 07:10 saez0pub

And another strange behaviour: for another job (kubernetes-pods-slow), we find the container as a dropped target.

Why did myapp-container disappear from the scrape pool kubernetes-pods?

 curl -s localhost:9090/api/v1/targets | jq '.data.droppedTargets[]|select(.discoveredLabels."__meta_kubernetes_pod_label_app_kubernetes_io_name"=="MyApp")' | grep -e __meta_kubernetes_pod_container -e scrapePool
    "__meta_kubernetes_pod_container_id": "docker://b52dc3295c57d783205d6111d5447703d3149b6e26d0dbd2c47c47a79b19d50f",
    "__meta_kubernetes_pod_container_image": "quay.io/brancz/prometheus-example-app:v0.5.0",
    "__meta_kubernetes_pod_container_init": "false",
    "__meta_kubernetes_pod_container_name": "myapp-container",
    "__meta_kubernetes_pod_container_port_number": "8080",
    "__meta_kubernetes_pod_container_port_protocol": "TCP",
  "scrapePool": "kubernetes-pods-slow"
    "__meta_kubernetes_pod_container_id": "docker://bd60098b7550881734c8b440d2dfd1e43baec5831f5962d7c32d57302f546a95",
    "__meta_kubernetes_pod_container_image": "busybox:1.28",
    "__meta_kubernetes_pod_container_init": "true",
    "__meta_kubernetes_pod_container_name": "init-myservice",
  "scrapePool": "kubernetes-pods-slow"

Still no myapp-container in active targets:

 curl -s localhost:9090/api/v1/targets | jq '.data.activeTargets[]|select(.discoveredLabels."__meta_kubernetes_pod_label_app_kubernetes_io_name"=="MyApp")'  | grep -e __meta_kubernetes_pod_container -e scrapePool
    "__meta_kubernetes_pod_container_id": "docker://bd60098b7550881734c8b440d2dfd1e43baec5831f5962d7c32d57302f546a95",
    "__meta_kubernetes_pod_container_image": "busybox:1.28",
    "__meta_kubernetes_pod_container_init": "true",
    "__meta_kubernetes_pod_container_name": "init-myservice",
  "scrapePool": "kubernetes-pods",

Scrape config for kubernetes-pods-slow:
kubectl --context minikube get configmaps prometheus-server -o yaml | yq '.data."prometheus.yml"' | yq '.scrape_configs[]|select(.job_name=="kubernetes-pods-slow")'
honor_labels: true
job_name: kubernetes-pods-slow
kubernetes_sd_configs:
  - role: pod
relabel_configs:
  - action: keep
    regex: true
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_scrape_slow
  - action: replace
    regex: (https?)
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_scheme
    target_label: __scheme__
  - action: replace
    regex: (.+)
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_path
    target_label: __metrics_path__
  - action: replace
    regex: (\d+);(([A-Fa-f0-9]{1,4}::?){1,7}[A-Fa-f0-9]{1,4})
    replacement: '[$2]:$1'
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_port
      - __meta_kubernetes_pod_ip
    target_label: __address__
  - action: replace
    regex: (\d+);((([0-9]+?)(\.|$)){4})
    replacement: $2:$1
    source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_port
      - __meta_kubernetes_pod_ip
    target_label: __address__
  - action: labelmap
    regex: __meta_kubernetes_pod_annotation_prometheus_io_param_(.+)
    replacement: __param_$1
  - action: labelmap
    regex: __meta_kubernetes_pod_label_(.+)
  - action: replace
    source_labels:
      - __meta_kubernetes_namespace
    target_label: namespace
  - action: replace
    source_labels:
      - __meta_kubernetes_pod_name
    target_label: pod
  - action: drop
    regex: Pending|Succeeded|Failed|Completed
    source_labels:
      - __meta_kubernetes_pod_phase
  - action: replace
    source_labels:
      - __meta_kubernetes_pod_node_name
    target_label: node
scrape_interval: 5m
scrape_timeout: 30s

saez0pub avatar Oct 02 '25 08:10 saez0pub