
[bug] node dashboard shows no values

Open chuegel opened this issue 1 year ago • 6 comments

Describe the bug

We've deployed kube-prometheus-stack via flux:

flux get helmreleases -n monitoring
NAME                    REVISION        SUSPENDED       READY   MESSAGE
kube-prometheus-stack   58.2.2          False           True    Helm upgrade succeeded for release monitoring/kube-prometheus-stack.v6 with chart kube-prometheus-stack@58.2.2
loki-stack              2.10.2          False           True    Helm install succeeded for release monitoring/loki-stack.v1 with chart loki-stack@2.10.2

The Grafana dashboards were installed with Helm values as described. However, we can't see any metrics on the node dashboard, even after adding the relabeling below to the Helm values:

# File: kube-prometheus-stack-values.yaml
prometheus-node-exporter:
  prometheus:
    monitor:
      relabelings:
      - action: replace
        sourceLabels: [__meta_kubernetes_pod_node_name]
        targetLabel: nodename
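One way to check whether a relabeling like this actually took effect is to query Prometheus and inspect the labels on the returned series. The sketch below is a minimal helper for that; the sample payload is illustrative (shaped like a standard Prometheus `/api/v1/query` response, not output from this cluster):

```python
import json

# Illustrative /api/v1/query response for a node-exporter metric,
# shaped like Prometheus's standard instant-query output.
sample = json.loads("""
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {"metric": {"__name__": "node_uname_info",
                  "instance": "10.0.0.1:9100",
                  "nodename": "worker-1"},
       "value": [1714000000, "1"]}
    ]
  }
}
""")

def missing_nodename(response):
    """Return the instances whose series lack the `nodename` label."""
    return [s["metric"].get("instance")
            for s in response["data"]["result"]
            if "nodename" not in s["metric"]]

print(missing_nodename(sample))  # [] when the relabeling is applied
```

If the list is non-empty for node-exporter series, the relabeling never reached the scrape config.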

How to reproduce?

No response

Expected behavior

No response

Additional context

kubectl get po -n monitoring 
NAME                                                       READY   STATUS    RESTARTS   AGE
kube-prometheus-stack-grafana-75c985bc44-5g7sm             3/3     Running   0          7m34s
kube-prometheus-stack-kube-state-metrics-c4dbc548d-l5tcl   1/1     Running   0          17m
kube-prometheus-stack-operator-7846887766-98vvj            1/1     Running   0          17m
kube-prometheus-stack-prometheus-node-exporter-5x97x       1/1     Running   0          17m
kube-prometheus-stack-prometheus-node-exporter-97dbf       1/1     Running   0          17m
kube-prometheus-stack-prometheus-node-exporter-hz4zf       1/1     Running   0          17m
loki-stack-0                                               1/1     Running   0          17m
loki-stack-promtail-bc95r                                  1/1     Running   0          17m
loki-stack-promtail-fpnh9                                  1/1     Running   0          17m
loki-stack-promtail-z64hg                                  1/1     Running   0          17m
prometheus-kube-prometheus-stack-prometheus-0              2/2     Running   0          17m

chuegel avatar Apr 24 '24 18:04 chuegel

Hi @chuegel, did you try without the relabelings first? They should only be used if the dashboards don't work out of the box. Can you share your Kubernetes distribution, Kubernetes version, and the full values configuration for kube-prometheus-stack?

dotdc avatar Apr 26 '24 05:04 dotdc

Hi @dotdc, thanks for your reply. Yes, I tried it without the relabelings first, but the metrics didn't show up.

[screenshot]

kubectl version -o yaml
clientVersion:
  buildDate: "2023-06-14T09:53:42Z"
  compiler: gc
  gitCommit: 25b4e43193bcda6c7328a6d147b1fb73a33f1598
  gitTreeState: clean
  gitVersion: v1.27.3
  goVersion: go1.20.5
  major: "1"
  minor: "27"
  platform: linux/amd64
kustomizeVersion: v5.0.1
serverVersion:
  buildDate: "2023-06-14T22:02:13Z"
  compiler: gc
  gitCommit: 25b4e43193bcda6c7328a6d147b1fb73a33f1598
  gitTreeState: clean
  gitVersion: v1.27.3+rke2r1
  goVersion: go1.20.5 X:boringcrypto
  major: "1"
  minor: "27"
  platform: linux/amd64

Here is the values configuration:

apiVersion: helm.toolkit.fluxcd.io/v2beta2
kind: HelmRelease
metadata:
  name: kube-prometheus-stack
spec:
  interval: 1h
  chart:
    spec:
      version: "58.x"
      chart: kube-prometheus-stack
      sourceRef:
        kind: HelmRepository
        name: prometheus-community
      interval: 1h
  install:
    crds: Create
  upgrade:
    crds: CreateReplace
  driftDetection:
    mode: enabled
    ignore:
      # Ignore "validated" annotation which is not inserted during install
      - paths: [ "/metadata/annotations/prometheus-operator-validated" ]
        target:
          kind: PrometheusRule
  valuesFrom:
  - kind: ConfigMap
    name: flux-kube-state-metrics-config
    valuesKey: kube-state-metrics-config.yaml
  # https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/values.yaml
  values:
    alertmanager:
      enabled: false
    prometheus-node-exporter:
      prometheus:
        monitor:
          relabelings:
          - action: replace
            sourceLabels: [__meta_kubernetes_pod_node_name]
            targetLabel: nodename
    prometheus:
      service:
        labels:
          operated-prometheus: "true"
      prometheusSpec:
        retention: 24h
        resources:
          requests:
            cpu: 200m
            memory: 200Mi
        podMonitorNamespaceSelector: { }
        podMonitorSelector:
          matchLabels:
            app.kubernetes.io/component: monitoring
    grafana:
      defaultDashboardsEnabled: false
      adminPassword: flux
      dashboardProviders:
        dashboardproviders.yaml:
          apiVersion: 1
          providers:
          - name: 'grafana-dashboards-kubernetes'
            orgId: 1
            folder: 'Kubernetes'
            type: file
            disableDeletion: true
            editable: true
            options:
              path: /var/lib/grafana/dashboards/grafana-dashboards-kubernetes
      dashboards:
        grafana-dashboards-kubernetes:
          k8s-system-api-server:
            url: https://raw.githubusercontent.com/dotdc/grafana-dashboards-kubernetes/master/dashboards/k8s-system-api-server.json
            token: ''
          k8s-system-coredns:
            url: https://raw.githubusercontent.com/dotdc/grafana-dashboards-kubernetes/master/dashboards/k8s-system-coredns.json
            token: ''
          k8s-views-global:
            url: https://raw.githubusercontent.com/dotdc/grafana-dashboards-kubernetes/master/dashboards/k8s-views-global.json
            token: ''
          k8s-views-namespaces:
            url: https://raw.githubusercontent.com/dotdc/grafana-dashboards-kubernetes/master/dashboards/k8s-views-namespaces.json
            token: ''
          k8s-views-nodes:
            url: https://raw.githubusercontent.com/dotdc/grafana-dashboards-kubernetes/master/dashboards/k8s-views-nodes.json
            token: ''
          k8s-views-pods:
            url: https://raw.githubusercontent.com/dotdc/grafana-dashboards-kubernetes/master/dashboards/k8s-views-pods.json
            token: ''

chuegel avatar Apr 26 '24 07:04 chuegel

Are all the other dashboards working correctly?

I think you can remove:

    prometheus-node-exporter:
      prometheus:
        monitor:
          relabelings:
          - action: replace
            sourceLabels: [__meta_kubernetes_pod_node_name]
            targetLabel: nodename

Also, can you try to remove podMonitorNamespaceSelector and podMonitorSelector in prometheusSpec and replace them with:

            podMonitorSelectorNilUsesHelmValues: false
            serviceMonitorSelectorNilUsesHelmValues: false
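When these two flags are left at their default of true, the chart configures Prometheus to select only the monitors deployed by the Helm release itself; setting them to false makes Prometheus select all ServiceMonitors and PodMonitors in scope. A sketch of where they sit in the values (placement assumed from the chart's documented values layout):

```yaml
prometheus:
  prometheusSpec:
    # Assumed placement: select all monitors, not only those
    # matching this Helm release (the chart defaults are true).
    podMonitorSelectorNilUsesHelmValues: false
    serviceMonitorSelectorNilUsesHelmValues: false
```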

Memory requests for Prometheus might be a bit low; do you have any pod restarts or errors?

If nothing works, can you check the output of each variable on the node dashboard?

dotdc avatar Apr 26 '24 11:04 dotdc

> Are all the other dashboards working correctly?

All except Nodes and Pods; some values are also missing in Namespaces and in Global:

> I think you can remove:
>
>     prometheus-node-exporter:
>       prometheus:
>         monitor:
>           relabelings:
>           - action: replace
>             sourceLabels: [__meta_kubernetes_pod_node_name]
>             targetLabel: nodename

Done.

> Also, can you try to remove podMonitorNamespaceSelector and podMonitorSelector in prometheusSpec and replace them with:
>
>     podMonitorSelectorNilUsesHelmValues: false
>     serviceMonitorSelectorNilUsesHelmValues: false

Changed that, but it made no difference.

> Memory requests for Prometheus might be a bit low; do you have any pod restarts or errors?

Pods are fine, no restarts.

> If nothing works, can you check the output of each variable on the node dashboard?

For example, on the node dashboard the metric kube_node_info doesn't seem to return any values:

[screenshot]

chuegel avatar Apr 27 '24 05:04 chuegel

I fixed this by renaming the label filter nodename to node:

[screenshot]
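This fix amounts to editing the dashboard's node variable query. A hedged sketch of the two variants, assuming the variable is defined with a Grafana label_values query (the exact definition in the dashboard may differ): node-exporter's node_uname_info carries the node name in its nodename label, while kube-state-metrics' kube_node_info uses the node label instead.

```
# Variable based on node-exporter (nodename label):
label_values(node_uname_info, nodename)

# Variable based on kube-state-metrics (node label):
label_values(kube_node_info, node)
```

If only one of the two metrics is being scraped, only the matching variant will return node names.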

wigarddev avatar May 06 '24 03:05 wigarddev

@wigarddev

When I do that, I get:

[screenshot]

chuegel avatar May 06 '24 12:05 chuegel

I removed all the CRDs and re-deployed the stack; now the metrics are visible. Thanks!

chuegel avatar May 06 '24 18:05 chuegel