
[prometheus-kube-stack] ServiceMonitor wrong labels for kube-state-metrics service scraping

Open · ff-fgomez opened this issue 2 years ago · 4 comments

Describe the bug

The Helm chart produces wrong labels in the kube-state-metrics ServiceMonitor. I spent a fair amount of time tracking this bug down, so I hope this helps anyone whose kube-state-metrics metrics are not reaching Prometheus.

My workaround was to create a custom ServiceMonitor (shown below), which successfully scrapes kube-state-metrics.

What's your helm version?

version.BuildInfo{Version:"v3.9.3", GitCommit:"414ff28d4029ae8c8b05d62aa06c7fe3dee2bc58", GitTreeState:"clean", GoVersion:"go1.17.13"}

What's your kubectl version?

Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.4", GitCommit:"b695d79d4f967c403a96986f1750a35eb75e75f1", GitTreeState:"clean", BuildDate:"2021-11-17T15:48:33Z", GoVersion:"go1.16.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.7", GitCommit:"42c05a547468804b2053ecf60a3bd15560362fc2", GitTreeState:"clean", BuildDate:"2022-05-24T12:24:41Z", GoVersion:"go1.17.10", Compiler:"gc", Platform:"linux/amd64"}

Which chart?

kube-prometheus-stack

What's the chart version?

39.6.0

What happened?

The Helm chart generates this Service and ServiceMonitor:

apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: "true"
  labels:
    app.kubernetes.io/component: metrics
    app.kubernetes.io/instance: kube-prometheus-stack
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/part-of: kube-state-metrics
    app.kubernetes.io/version: 2.5.0
    helm.sh/chart: kube-state-metrics-4.15.0
    release: prometheus-community
  name: prometheus-community-kube-state-metrics
  namespace: monitoring
spec:
  internalTrafficPolicy: Cluster
  ports:
  - name: http
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app.kubernetes.io/instance: prometheus-community
    app.kubernetes.io/name: kube-state-metrics
  sessionAffinity: None
  type: ClusterIP
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    app.kubernetes.io/component: metrics
    app.kubernetes.io/instance: kube-prometheus-stack
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/part-of: kube-state-metrics
    app.kubernetes.io/version: 2.5.0
    helm.sh/chart: kube-state-metrics-4.15.0
    release: prometheus-community
  name: prometheus-community-kube-state-metrics
  namespace: monitoring
spec:
  endpoints:
  - honorLabels: true
    port: http
    scheme: http
  jobLabel: app.kubernetes.io/name
  selector:
    matchLabels:
      app.kubernetes.io/instance: prometheus-community
      app.kubernetes.io/name: kube-state-metrics

The problem is that the matchLabels in the ServiceMonitor do not match the Service's labels, specifically this one: app.kubernetes.io/instance.
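
To confirm the mismatch directly on the cluster, the Service labels and the ServiceMonitor selector can be compared with kubectl (a quick sketch, using the resource names and namespace from the manifests above):

kubectl -n monitoring get service prometheus-community-kube-state-metrics --show-labels
kubectl -n monitoring get servicemonitor prometheus-community-kube-state-metrics \
  -o jsonpath='{.spec.selector.matchLabels}'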

My custom ServiceMonitor uses the correct matchLabels:

---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kube-state-metrics-servicemonitor-custom
  namespace: monitoring
spec:
  jobLabel: app.kubernetes.io/name
  selector:
    matchLabels:
      app.kubernetes.io/instance: kube-prometheus-stack
      app.kubernetes.io/name: kube-state-metrics
  endpoints:
    - port: http
      scheme: http

What did you expect to happen?

For all of these metrics to be available in Prometheus: https://github.com/kubernetes/kube-state-metrics/tree/master/docs

In the Targets section we should see a valid state of UP for that ServiceMonitor.
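
One quick way to check this from the command line (a sketch; the exact name of the Prometheus service created by the chart depends on the release name, so list it first with kubectl -n monitoring get svc):

kubectl -n monitoring port-forward svc/<prometheus-service> 9090:9090
# in another terminal, query the Prometheus targets API:
curl -s http://localhost:9090/api/v1/targets | grep kube-state-metrics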

How to reproduce it?

Install the Helm chart with this kustomize helmCharts configuration:

helmCharts:
  - name: kube-prometheus-stack
    releaseName: prometheus-community
    version: 39.6.0
    repo: https://prometheus-community.github.io/helm-charts
    namespace: monitoring
    valuesFile: values.yaml
    includeCRDs: false
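
The block above lives in a kustomization.yaml; a minimal sketch of building and applying it, assuming a kustomize version with Helm chart support enabled:

kustomize build --enable-helm . | kubectl apply -f -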

Enter the changed values of values.yaml?

kube-state-metrics:
  namespaceOverride: ""
  rbac:
    create: true
  releaseLabel: true
  prometheus:
    monitor:
      enabled: true

      ## Scrape interval. If not set, the Prometheus default scrape interval is used.
      ##
      interval: ""

      ## Scrape Timeout. If not set, the Prometheus default scrape timeout is used.
      ##
      scrapeTimeout: ""

      ## proxyUrl: URL of a proxy that should be used for scraping.
      ##
      proxyUrl: ""

      # Keep labels from scraped data, overriding server-side labels
      ##
      honorLabels: true

      ## MetricRelabelConfigs to apply to samples after scraping, but before ingestion.
      ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
      ##
      metricRelabelings: []
      # - action: keep
      #   regex: 'kube_(daemonset|deployment|pod|namespace|node|statefulset).+'
      #   sourceLabels: [__name__]

      ## RelabelConfigs to apply to samples before scraping
      ## ref: https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#relabelconfig
      ##
      relabelings: []
      # - sourceLabels: [__meta_kubernetes_pod_node_name]
      #   separator: ;
      #   regex: ^(.*)$
      #   targetLabel: nodename
      #   replacement: $1
      #   action: replace

  selfMonitor:
    enabled: false

Enter the command that you execute that is failing/misfunctioning.

I just don't see any metrics in Prometheus from the service being scraped.

Anything else we need to know?

No response

ff-fgomez · Aug 19 '22

Another workaround for this is to add a patch to the kustomize deployment.

helmCharts:
  - name: kube-prometheus-stack
    releaseName: prometheus-community
    version: 39.6.0
    repo: https://prometheus-community.github.io/helm-charts
    namespace: monitoring
    valuesFile: values.yaml
    includeCRDs: false

patchesStrategicMerge:
  - patch-servicemonitor-kube-state-metrics.yaml

Set patch-servicemonitor-kube-state-metrics.yaml to:

---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: prometheus-community-kube-state-metrics
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app.kubernetes.io/instance: kube-prometheus-stack
      app.kubernetes.io/name: kube-state-metrics

ff-fgomez · Aug 19 '22

You should be able to override the selector labels using kube-state-metrics.prometheus.monitor.selectorOverride.

The kube-state-metrics service monitor sets the selector to those values if given: https://github.com/prometheus-community/helm-charts/blob/kube-state-metrics-4.15.0/charts/kube-state-metrics/templates/servicemonitor.yaml#L16-L20
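
For example, a minimal (untested) values.yaml sketch using that override, with the label values taken from the Service shown above:

kube-state-metrics:
  prometheus:
    monitor:
      enabled: true
      selectorOverride:
        app.kubernetes.io/instance: kube-prometheus-stack
        app.kubernetes.io/name: kube-state-metrics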

A bit stumped as to why there is a discrepancy in the app.kubernetes.io/instance: kube-prometheus-stack label and app.kubernetes.io/instance: prometheus-community selector, since they should be using the same values 🤔 (looking at https://github.com/prometheus-community/helm-charts/blob/kube-state-metrics-4.15.0/charts/kube-state-metrics/templates/_helpers.tpl#L64)

gracedo · Aug 24 '22

That's a great option; I didn't see that value was available. I don't know why the two are generated differently, since they are part of the same chart.

ff-fgomez · Aug 26 '22

It seems that in 39.12.1 it's already fixed:

kubectl -n kube-prometheus describe servicemonitors.monitoring.coreos.com kube-prometheus-stack-kube-state-metrics

Selector:
  Match Labels:
    app.kubernetes.io/instance: kube-prometheus-stack
    app.kubernetes.io/name:     kube-state-metrics

barzog · Sep 12 '22

The kube-state-metrics chart's selector seems configurable, but unfortunately the kube-proxy ServiceMonitor inside the kube-prometheus-stack chart is not.

https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/templates/exporters/kube-proxy/servicemonitor.yaml#L17

I would like to override it with component=kube-proxy
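
Until that selector is configurable, the same patchesStrategicMerge approach from earlier in this thread should work for the kube-proxy ServiceMonitor too; a sketch only (the ServiceMonitor name below is hypothetical, so check the actual name with kubectl -n monitoring get servicemonitors first):

---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  # hypothetical name; use the one the chart actually created
  name: prometheus-community-kube-prometheus-kube-proxy
  namespace: monitoring
spec:
  selector:
    matchLabels:
      # matches a kube-proxy Service that actually carries this label
      component: kube-proxy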

locomoco28 · Sep 30 '22

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

stale[bot] · Oct 30 '22

This issue is being automatically closed due to inactivity.

stale[bot] · Nov 22 '22