helm-charts

Duplicate data in Grafana dashboards

Open dorsegal opened this issue 1 year ago • 5 comments

We are using the victoria-metrics-k8s-stack Helm chart with the following settings:

#################################################
###              Service Monitors           #####
#################################################
## Component scraping the kubelets
kubelet:
  enabled: true

  # -- Enable scraping /metrics/cadvisor from kubelet's service
  cadvisor: true
  # -- Enable scraping /metrics/probes from kubelet's service
  probes: true
  # spec for VMNodeScrape crd
  # https://docs.victoriametrics.com/operator/api.html#vmnodescrapespec
  spec:
    scheme: "https"
    honorLabels: true
    interval: "30s"
    scrapeTimeout: "5s"
    tlsConfig:
      insecureSkipVerify: true
      caFile: "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"
    bearerTokenFile: "/var/run/secrets/kubernetes.io/serviceaccount/token"
    # drop high cardinality label and useless metrics for cadvisor and kubelet
    metricRelabelConfigs:
      - action: labeldrop
        regex: (uid)
      - action: labeldrop
        regex: (id|name)
      - action: drop
        source_labels: [__name__]
        regex: (rest_client_request_duration_seconds_bucket|rest_client_request_duration_seconds_sum|rest_client_request_duration_seconds_count)
    relabelConfigs:
      - action: labelmap
        regex: __meta_kubernetes_node_label_(.+)
      - sourceLabels: [__metrics_path__]
        targetLabel: metrics_path
      - targetLabel: "job"
        replacement: "kubelet"
    # ignore timestamps of cadvisor's metrics by default
    # more info here https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4697#issuecomment-1656540535
    honorTimestamps: false

Since we enabled cadvisor, we have duplicated metrics:

  • machine_cpu_cores
  • machine_memory_bytes
  • Potentially more
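
A quick way to confirm the duplication is to count the series per source. This is only a sketch: it assumes the duplicates differ in the `job` label or in the `metrics_path` label that the relabelConfigs above attach, which may not hold in every setup.

```promql
# Number of machine_cpu_cores series per job and scrape path.
# More than one row in the result means the metric arrives from several sources.
count by (job, metrics_path) (machine_cpu_cores)
```

The same query can be repeated with machine_memory_bytes.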

This causes wrong graphs in some dashboards that do calculations with those metrics.

If you look at this dashboard, for example charts/victoria-metrics-k8s-stack/templates/grafana/dashboards/k8s-views-global.yaml, you can find this expression:

"expr": "sum(kube_pod_container_resource_requests{resource=\"cpu\"}) / sum(machine_cpu_cores)",

To fix this, I suggest adding the job label so it looks like this:

"expr": "sum(kube_pod_container_resource_requests{resource=\"cpu\"}) / sum(machine_cpu_cores{job=\"kubelet\"})",
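
To see how far the unfiltered denominator is off, the two sums can be compared directly. This is a rough check, assuming the series with job="kubelet" are the ones that should be counted:

```promql
# Ratio of the unfiltered core count to the kubelet-only core count.
# A result of 1 means no duplication; a result of 2 would mean every core is counted twice.
sum(machine_cpu_cores) / sum(machine_cpu_cores{job="kubelet"})
```

Any result above 1 means the dashboard's CPU requests ratio is understated by that factor.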

Also, I noticed that you removed the job label for node-exporter, which will also cause issues in case we have a different job collecting metrics with the same metric name from outside Kubernetes.

dorsegal, Oct 04 '23 07:10