kube-prometheus node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate

record: node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate
expr: sum by(cluster, namespace, pod, container) (rate(container_cpu_usage_seconds_total{container!="POD",image!="",job="kubelet",metrics_path="/metrics/cadvisor"}[5m])) * on(cluster, namespace, pod) group_left(node) topk by(cluster, namespace, pod) (1, max by(cluster, namespace, pod, node) (kube_pod_info{node!=""}))

This recording rule does not work in Prometheus. If I change on(cluster, namespace, pod) to on(namespace, pod), it works.

Prometheus Operator version: release-0.6
Kubernetes version information:

kubectl version
Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.14", GitCommit:"89182bdd065fbcaffefec691908a739d161efc03", GitTreeState:"clean", BuildDate:"2020-12-18T12:11:25Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.14", GitCommit:"89182bdd065fbcaffefec691908a739d161efc03", GitTreeState:"clean", BuildDate:"2020-12-18T12:02:35Z", GoVersion:"go1.13.15", Compiler:"gc", Platform:"linux/amd64"}

Mar 24 '21 09:03 xiaozhangzhang1

It looks like you have a cluster label in one metric but not in the other. Does container_cpu_usage_seconds_total{container!="POD",image!="",job="kubelet",metrics_path="/metrics/cadvisor", cluster!=""} and kube_pod_info{node!="",cluster!=""} return any output?

Mar 24 '21 09:03 paulfantom

> It looks like you have a cluster label in one metric but not in the other. Does container_cpu_usage_seconds_total{container!="POD",image!="",job="kubelet",metrics_path="/metrics/cadvisor", cluster!=""} and kube_pod_info{node!="",cluster!=""} return any output?

I did; it returns no data.

Mar 24 '21 09:03 xiaozhangzhang1

> It looks like you have a cluster label in one metric but not in the other. Does container_cpu_usage_seconds_total{container!="POD",image!="",job="kubelet",metrics_path="/metrics/cadvisor", cluster!=""} and kube_pod_info{node!="",cluster!=""} return any output?

kube_pod_info{node!="",cluster!=""} returns many series, for example: kube_pod_info{cluster="",container="kube-rbac-proxy-main",created_by_kind="",created_by_name="",host_ip="",instance="",job="kube-state-metrics",namespace="default",node="master01",pod="netshoot",pod_ip="",uid="ef6d61ac-fed4-4ee3-b757-de912a6863fb"}

Mar 24 '21 09:03 xiaozhangzhang1

container_cpu_usage_seconds_total{container!="POD",image!="",job="kubelet",metrics_path="/metrics/cadvisor", cluster!=""} returns no data.

Mar 24 '21 09:03 xiaozhangzhang1

It seems that your cluster is not configured correctly and you have cluster label attached to metrics from kube-state-metrics, but not to metrics from kubelet. You need to have it in both places.
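To illustrate the explanation above, here is a minimal Python sketch of how on(...) pairs series. It is a simplified model, not Prometheus code: the helper names match_key and join_on, the cluster value "prod", and the sample label sets are all made up for illustration. It relies on the PromQL rule that an absent label behaves like a label with an empty value.

```python
# Simplified model of PromQL vector matching with on(...).
# An absent label is treated like a label whose value is "".

def match_key(labels, on):
    """Build the join key from the labels listed in on(...)."""
    return tuple(labels.get(name, "") for name in on)


def join_on(left, right, on):
    """Return the left-hand series that find a partner on the right-hand side."""
    right_keys = {match_key(labels, on) for labels in right}
    return [labels for labels in left if match_key(labels, on) in right_keys]


# Hypothetical series: the cAdvisor side has no cluster label,
# the kube-state-metrics side does.
cadvisor = [{"namespace": "default", "pod": "netshoot", "container": "netshoot"}]
kube_pod_info = [
    {"cluster": "prod", "namespace": "default", "pod": "netshoot", "node": "master01"}
]

with_cluster = join_on(cadvisor, kube_pod_info, on=("cluster", "namespace", "pod"))
without_cluster = join_on(cadvisor, kube_pod_info, on=("namespace", "pod"))

print(len(with_cluster))     # 0: the cluster values differ ("" vs "prod")
print(len(without_cluster))  # 1: namespace and pod agree
```

With cluster present on only one side, the join keys differ and the result is empty, which matches the behaviour reported at the top of this thread. Dropping cluster from on(...) lets the two sides pair up again.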

Mar 24 '21 09:03 paulfantom

> It seems that your cluster is not configured correctly and you have cluster label attached to metrics from kube-state-metrics, but not to metrics from kubelet. You need to have it in both places.

Thanks, I got it. Yes, I added the cluster label to kube-state-metrics. I did the same for the kubelet, but it still does not work.

Mar 24 '21 09:03 xiaozhangzhang1

I have the same issue: CPU usage no longer works in Grafana.

Jul 20 '21 17:07 ArchiFleKs

Hi,

I'm not sure, but I think this issue was fixed in:

https://github.com/prometheus-operator/kube-prometheus/commit/78a467737064513e64eeb2b1df60a2478cfeb23c

It seems that the record node_namespace_pod_container:container_cpu_usage_seconds_total:sum_rate was replaced by node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate

Oct 05 '21 17:10 rouja

I am also facing the same issue.

sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster="$cluster", namespace="$namespace"}) / sum(kube_pod_container_resource_requests{job="kube-state-metrics", cluster="$cluster", namespace="$namespace", resource="cpu"})

Does not return data

Nov 16 '22 09:11 SonalJain1707

+1, the same thing.

> I am also facing the same issue.
>
> sum(node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate{cluster="$cluster", namespace="$namespace"}) / sum(kube_pod_container_resource_requests{job="kube-state-metrics", cluster="$cluster", namespace="$namespace", resource="cpu"})
>
> Does not return data

Dec 12 '22 13:12 rmn-lux

Querying node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate on my Prometheus instance does not return anything. Which exporter does this metric come from?

Dec 22 '22 09:12 jeremydescamps

Hi there,

Prometheus stack chart: kube-prometheus-stack-43.1.1, App Version: 0.61.1
K8s deployed with Rancher Docker version 2.7: 1.24.4

For me, it was related to this issue: https://github.com/k3s-io/k3s/issues/5782
As mentioned in the issue, the image label is now missing.

Workaround: I removed the image!="" matcher from all the rules in the prometheus-stack-kube-prom-k8s.rules.yaml file, and now my Grafana dashboards work like a charm.

# Extract from file: prometheus-stack-kube-prom-k8s.rules.yaml
- name: k8s.rules
  rules:
    - expr: >-
        sum by (cluster, namespace, pod, container) (
          irate(container_cpu_usage_seconds_total{job="kubelet", metrics_path="/metrics/cadvisor", image!=""}[5m])
        ) * on (cluster, namespace, pod) group_left(node) topk by (cluster,
        namespace, pod) (
          1, max by(cluster, namespace, pod, node) (kube_pod_info{node!=""})
        )
      record: >-
        node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate

This file has to be modified as well: prometheus-stack-kube-prom-k8s-resources-workload.yaml. Remove container!="" and image!="" there. Don't forget to kill the prometheus-stack-grafana pod so the dashboards get updated!
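For anyone applying this workaround to many rules, a small text substitution can do the removal. The sketch below is only that: drop_matcher is a hypothetical helper, it treats the expression as plain text rather than parsing PromQL, and its output should be checked against the actual rule files before use.

```python
import re


def drop_matcher(expr, matcher):
    """Remove one label matcher (for example image!="") from every selector in expr."""
    escaped = re.escape(matcher)
    # Matcher followed by a comma: drop both and keep the rest of the selector.
    expr = re.sub(escaped + r"\s*,\s*", "", expr)
    # Matcher preceded by a comma (last in the selector): drop the comma too.
    expr = re.sub(r"\s*,\s*" + escaped, "", expr)
    # Matcher standing alone inside the braces.
    return expr.replace(matcher, "")


rule = (
    'irate(container_cpu_usage_seconds_total{job="kubelet", '
    'metrics_path="/metrics/cadvisor", image!=""}[5m])'
)
print(drop_matcher(rule, 'image!=""'))
```

Because this is plain text matching, it would also hit a longer label name that happens to end in "image", so it is worth diffing the result before reloading the rules.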

Dec 22 '22 09:12 fguiet

This issue has been automatically marked as stale because it has not had any activity in the last 60 days. Thank you for your contributions.

Feb 21 '23 03:02 github-actions[bot]

Hi @fguiet, I got stuck on this problem as well. I am using the latest Helm chart version with minikube v1.28.0.

I've already removed the image!="" matcher from k8s.rules, and the CPU dashboards started working.

However, I still have issues with the memory dashboard, which uses the metric container_memory_working_set_bytes.

Example: sum(container_memory_working_set_bytes{job="kubelet", metrics_path="/metrics/cadvisor", cluster="$cluster", namespace="$namespace", pod="$pod", container!="", image!=""}) by (container)

In Prometheus, this metric has no cluster, container, or image labels. Did you face this issue as well and, if so, how did you fix it?

Similar issue with dashboards using fs metrics (and probably a lot of other metrics from cadvisor): sum by(container) (rate(container_fs_reads_total{job="kubelet", metrics_path="/metrics/cadvisor", device=~"(/dev/)?(mmcblk.p.+|nvme.+|rbd.+|sd.+|vd.+|xvd.+|dm-.+|md.+|dasd.+)", container!="", cluster="$cluster", namespace="$namespace", pod="$pod"}[$__rate_interval]))

Thanks
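One possible reading of the empty panels described above: a != "" matcher only selects series where the label is present and non-empty, so a series without container or image labels is filtered out. A minimal Python sketch of that rule follows. The matches helper and the sample series are made up for illustration, and only the = and != matchers are modelled.

```python
def matches(series, equal=None, not_equal=None):
    """Evaluate = and != matchers; an absent label is treated as ""."""
    equal = equal or {}
    not_equal = not_equal or {}
    return all(series.get(k, "") == v for k, v in equal.items()) and all(
        series.get(k, "") != v for k, v in not_equal.items()
    )


# Hypothetical cAdvisor series with no container or image label.
series = {"job": "kubelet", "namespace": "default", "pod": "netshoot"}

strict = matches(series, equal={"job": "kubelet"}, not_equal={"container": "", "image": ""})
relaxed = matches(series, equal={"job": "kubelet"})

print(strict)   # False: container and image are absent, so != "" fails
print(relaxed)  # True: only job is checked
```

Under this model, removing the != "" matchers is what lets the series through, which is consistent with the workaround reported earlier in the thread.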

Mar 17 '23 18:03 bmgante

FWIW, this has been addressed in later versions of Rancher Server (2.6.11), along with upgrading k8s to 1.24.10-rancher4-1.

Mar 27 '23 13:03 sirajkrm

> Querying node_namespace_pod_container:container_cpu_usage_seconds_total:sum_irate on my Prometheus instance does not return anything. Which exporter does this metric come from?

Same for me. Did you find it in the end?

I am using the latest Helm chart of the Prometheus stack.

May 25 '23 07:05 anthosz

I'm hitting this as well. Any ideas?

Jul 12 '23 22:07 zuchka

same issue here

Jul 20 '23 15:07 jpiazza35

Check whether you have this in your Prometheus values:

Before:
kubelet:
  serviceMonitor:
    https: false

After (this works):
kubelet:
  serviceMonitor:
    https: true

The kubelet is responsible for these metrics.

In my case, I had disabled https in the ServiceMonitor for the kubelet. After some research, I found that https should be enabled for the kubelet, that is, https: true.

Aug 01 '23 08:08 DhruvPatel2647

I'm facing the same issue, reported here.

Removing container!="" and image!="" from prometheus-stack-kube-prom-k8s.rules.yaml worked.

Sep 06 '23 14:09 gustavofbreunig

Thanks to @gustavofbreunig.

Removing all image!="" matchers from the charts/kube-prometheus-stack/templates/prometheus/rules-1.14/k8s.rules.yaml file fixed my problem too.

This is my fork, if anyone wants to check it.

Sep 08 '23 11:09 mohamadkhani

Related issue: https://github.com/google/cadvisor/issues/3336

Sep 08 '23 14:09 gustavofbreunig

It was a Rancher issue, fixed in v1.24.10-rancher4-1:

https://github.com/rancher/rancher/issues/38934

Please proceed to close the issue.

Sep 08 '23 15:09 gustavofbreunig

> Related issue: google/cadvisor#3336

Also, related details if you are on Docker Desktop: https://github.com/docker/for-mac/issues/6969

Edit: ...or potentially just using the docker driver for minikube, or Docker Desktop, or really anything that involves Docker.

Sep 08 '23 23:09 nathanmcgarvey-modopayments

This issue has been automatically marked as stale because it has not had any activity in the last 60 days. Thank you for your contributions.

Nov 08 '23 03:11 github-actions[bot]

This issue was closed because it has not had any activity in the last 120 days. Please reopen if you feel this is still valid.

Mar 08 '24 03:03 github-actions[bot]
