kubernetes-app
Container dashboard showing doubled metrics
Hi!
I have a Kubernetes cluster on which I recently installed Prometheus through the Prometheus Operator, following the official Prometheus guide.
I connected the Grafana Kubernetes App to Kubernetes/Prometheus and then noticed that the memory and CPU metrics shown on the K8s container dashboard were doubled.
After some debugging, I found the query below for CPU usage:
sum(irate(container_cpu_usage_seconds_total{pod_name=~"my-pod"}[2m])) by (pod_name)
Executing the same query without the aggregation (container_cpu_usage_seconds_total{pod_name=~"my-pod"}), I could see that there is an extra item that represents the sum of all containers in the pod, as in the table below:
Element    Value
container_cpu_usage_seconds_total{container_name="POD",cpu="total",endpoint="http-metrics",id="/kubepods/burstable/pod1bb36063-bcf4-11e8-bba4-00505684c990/da0ebc2deef52414bc2f1cc8cb874956b6c3dc478ea8df0d2eb3106a605ba21c",image="k8s.gcr.io/pause:3.1",instance="10.17.14.237:10255",job="kubelet",name="k8s_POD_my-pod-b54bc7874-bpmhd_pocs_1bb36063-bcf4-11e8-bba4-00505684c990_0",namespace="pocs",pod_name="my-pod-b54bc7874-bpmhd",service="kubelet"} 0.012306132
container_cpu_usage_seconds_total{container_name="my-pod-rev2",cpu="total",endpoint="http-metrics",id="/kubepods/burstable/pod1bb36063-bcf4-11e8-bba4-00505684c990/f712e91d11913f01daa863fd3e5d59179afb40a7f1e8ff67cf8f24874d7fd44c",image="sha256:2f41732fe757bb8dc7739b5a536ea38ed3aa9c4bc8f6c3e3c342e4952b0e63e7",instance="10.17.14.237:10255",job="kubelet",name="k8s_my-pod-rev2_my-pod-b54bc7874-bpmhd_pocs_1bb36063-bcf4-11e8-bba4-00505684c990_1",namespace="pocs",pod_name="my-pod-b54bc7874-bpmhd",service="kubelet"} 107.36438691
container_cpu_usage_seconds_total{cpu="total",endpoint="http-metrics",id="/kubepods/burstable/pod1bb36063-bcf4-11e8-bba4-00505684c990",instance="10.17.14.237:10255",job="kubelet",namespace="pocs",pod_name="my-pod-b54bc7874-bpmhd",service="kubelet"} 145.636093587
This explains the doubled metrics: the sum by (pod_name) adds the pod-level series on top of the per-container series.
The extra item with the sum of containers (the third item in the table above) does not carry the container_name label.
To stop the Grafana Kubernetes App from showing doubled values, I edited the queries in the dashboard definition, adding container_name=~".+" to the filter, so the query for CPU usage now looks like this:
sum(irate(container_cpu_usage_seconds_total{pod_name=~"my-pod", container_name=~".+"}[2m])) by (pod_name)
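To illustrate why the extra matcher removes the double counting, here is a minimal Python sketch that mimics the label matching on the three series from the table above. It is a simplification, not how Prometheus is implemented: it sums the raw counter values instead of applying irate, keeps only the two labels that matter, and the helper names (matches, sum_by_pod) are hypothetical. It relies on two PromQL behaviours: regex matchers are fully anchored, and a missing label is treated as an empty string.

```python
import re

# Series abridged from the table above: only the labels relevant to the
# matcher are kept, paired with the counter value shown in the table.
series = [
    ({"container_name": "POD", "pod_name": "my-pod-b54bc7874-bpmhd"}, 0.012306132),
    ({"container_name": "my-pod-rev2", "pod_name": "my-pod-b54bc7874-bpmhd"}, 107.36438691),
    # Pod-level series: no container_name label at all.
    ({"pod_name": "my-pod-b54bc7874-bpmhd"}, 145.636093587),
]


def matches(labels, name, pattern):
    """Mimic a PromQL =~ matcher: anchored regex, missing label treated as ""."""
    return re.fullmatch(pattern, labels.get(name, "")) is not None


def sum_by_pod(rows, extra_matchers=()):
    """Sum values per pod_name, keeping only rows that satisfy every matcher."""
    totals = {}
    for labels, value in rows:
        if all(matches(labels, n, p) for n, p in extra_matchers):
            pod = labels["pod_name"]
            totals[pod] = totals.get(pod, 0.0) + value
    return totals


# Without the matcher, the pod-level series is added to the container series.
unfiltered = sum_by_pod(series)
# With container_name=~".+", the series lacking the label is dropped.
filtered = sum_by_pod(series, [("container_name", ".+")])

print(unfiltered)
print(filtered)
```

Since ".+" cannot match the empty string, the pod-level series is the only one excluded. The matcher container_name!="" selects the same series and may read more clearly in a dashboard query.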
I'm wondering if there is a better way to handle this. Have you ever seen this scenario before?
I have the same problem :(
WE TOOOOOO!
ME TOO!!
AND ME!
AND ME