inputs.prometheus still requires cluster level permissions when scoped to a namespace

Open n0coast opened this issue 2 years ago • 5 comments

Relevant telegraf.conf

[[inputs.prometheus]]
      kubernetes_label_selector = "app.kubernetes.io/name=app,app.kubernetes.io/component=web"
      monitor_kubernetes_pods = true
      monitor_kubernetes_pods_method = "settings"
      monitor_kubernetes_pods_namespace = "some-namespace"
      monitor_kubernetes_pods_port = 8000

Logs from Telegraf

gha-telegraf-5fd694d667-c9xz9 telegraf W0907 00:08:07.576872       1 reflector.go:533] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:231: failed to list *v1.Namespace: namespaces is forbidden: User "system:serviceaccount:some-namespace:gha-telegraf" cannot list resource "namespaces" in API group "" at the cluster scope
gha-telegraf-5fd694d667-c9xz9 telegraf E0907 00:08:07.576920       1 reflector.go:148] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:231: Failed to watch *v1.Namespace: failed to list *v1.Namespace: namespaces is forbidden: User "system:serviceaccount:some-namespace:gha-telegraf" cannot list resource "namespaces" in API group "" at the cluster scope

System info

Telegraf 1.27.4, container: docker.io/library/telegraf:1.27-alpine, kubernetes v1.26

Docker

No response

Steps to reproduce

Add the following chart dependency:

   - name: telegraf
     version: 1.8.33
     repository: https://helm.influxdata.com/

Configure telegraf role as follows:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  annotations:
    ...
  labels:
    app.kubernetes.io/instance: gha
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: telegraf
    helm.sh/chart: telegraf-1.8.27
  name: gha-telegraf
  namespace: some-namespace
  ...
rules:
- apiGroups:
  - ""
  resources:
  - namespaces
  - pods
  - services
  verbs:
  - get
  - list
  - watch

Install the Helm chart with the telegraf dependency included, with inputs.prometheus configured and scoped to the namespace that both the application and Telegraf are running in.

Expected behavior

Metrics are scraped without issue in the configured namespace.

Actual behavior

Telegraf attempts to list namespaces at the cluster scope, is denied, and complains loudly in the logs.
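
A likely reason the Role above does not help: a namespaced Role only authorizes requests made inside its own namespace, while listing all namespaces is a cluster-scope request. The toy model below illustrates that distinction. It is not Kubernetes or Telegraf code; the `Grant`, `Request`, and `Allowed` names are hypothetical and the check is heavily simplified (core API group only, no resource names or wildcards).

```go
package main

import "fmt"

// Rule is a simplified RBAC rule: resources and verbs only.
type Rule struct {
	Resources []string
	Verbs     []string
}

// Grant is a hypothetical stand-in for a bound role.
// A non-empty Namespace models Role + RoleBinding;
// an empty Namespace models ClusterRole + ClusterRoleBinding.
type Grant struct {
	Namespace string
	Rules     []Rule
}

// Request describes an API call; an empty Namespace means cluster scope.
type Request struct {
	Namespace string
	Resource  string
	Verb      string
}

func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// Allowed reports whether any grant authorizes the request.
func Allowed(grants []Grant, req Request) bool {
	for _, g := range grants {
		// A namespaced grant never applies outside its own namespace,
		// so it cannot match a cluster-scope request.
		if g.Namespace != "" && g.Namespace != req.Namespace {
			continue
		}
		for _, r := range g.Rules {
			if contains(r.Resources, req.Resource) && contains(r.Verbs, req.Verb) {
				return true
			}
		}
	}
	return false
}

func main() {
	// Mirrors the Role from the steps to reproduce.
	role := Grant{Namespace: "some-namespace", Rules: []Rule{{
		Resources: []string{"namespaces", "pods", "services"},
		Verbs:     []string{"get", "list", "watch"},
	}}}
	grants := []Grant{role}

	// Listing pods inside the namespace is covered by the Role.
	fmt.Println(Allowed(grants, Request{Namespace: "some-namespace", Resource: "pods", Verb: "list"})) // true
	// Listing namespaces at cluster scope is not, even though the Role names "namespaces".
	fmt.Println(Allowed(grants, Request{Resource: "namespaces", Verb: "list"})) // false
}
```

Under this model, the only ways to silence the error are a cluster-wide grant or a client that stops issuing the cluster-scope list.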

Additional info

Similar to https://github.com/influxdata/telegraf/issues/12780, but we're using a more recent version of telegraf which should include this fix: https://github.com/influxdata/telegraf/pull/13063

n0coast • Sep 07 '23 00:09

@Ivaylogi98 or @redbaron are either of you able to run telegraf without cluster level permissions? I know you confirmed in the issue and PR, but @n0coast seems to not be able to.

powersj • Sep 11 '23 16:09

@n0coast, do you have any other instances of the inputs.prometheus plugin in the Telegraf config?

redbaron • Sep 12 '23 07:09

The fix you are looking for is https://github.com/influxdata/telegraf/pull/13627, which I think is in 1.28 but wasn't backported to 1.27.

redbaron • Sep 12 '23 07:09

@n0coast, looking at the error more closely, I think it comes from the part where it lists all namespaces. That listing exists for namespace_annotation_pass, but it happens even if this option is not specified, as in your config.

This can be improved, I agree
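
A minimal sketch of the kind of improvement meant here, assuming the namespace list is only needed for namespace annotation filtering: skip the cluster-scope list unless that filter is configured. This is not the actual Telegraf code; `FilterConfig` and `NeedsNamespaceList` are hypothetical names used only for illustration.

```go
package main

import "fmt"

// FilterConfig is a hypothetical stand-in for the relevant plugin settings.
// Only the option named in this thread is modeled.
type FilterConfig struct {
	// NamespaceAnnotationPass maps an annotation key to accepted values.
	NamespaceAnnotationPass map[string][]string
}

// NeedsNamespaceList reports whether namespace metadata is required at all.
// When it returns false, a caller could skip the cluster-scope namespace
// list/watch and therefore not need the matching permission.
func (c FilterConfig) NeedsNamespaceList() bool {
	return len(c.NamespaceAnnotationPass) > 0
}

func main() {
	// Mirrors the config in this issue: no namespace annotation filter is set.
	scoped := FilterConfig{}
	fmt.Println(scoped.NeedsNamespaceList()) // false

	// With a filter configured, namespace metadata is still required.
	filtered := FilterConfig{
		NamespaceAnnotationPass: map[string][]string{"team": {"web"}},
	}
	fmt.Println(filtered.NeedsNamespaceList()) // true
}
```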

redbaron • Sep 12 '23 13:09

Still facing the same issue with telegraf:1.29.

Config:

  inputs:
    - prometheus:
        monitor_kubernetes_pods: true
        monitor_kubernetes_pods_method: "annotations"
        monitor_kubernetes_pods_namespace: "thrinadh"
        namepass:
          - "badger_db_size"
        tagdrop:
          type: 
           - "total"

failed to list *v1.Namespace: namespaces is forbidden: User "system:serviceaccount:thrinadh:thrinadh-test-telegraf" cannot list resource "namespaces" in API group "" at the cluster scope

Thrinadh-Kumpatla • Jan 03 '24 14:01

Hello,

I'm facing the same issue with 1.29. No matter what I try, Telegraf still fails because of the cluster-scope permissions. Is there any workaround for this?

valeraBr • Feb 20 '24 10:02

@n0coast, @Thrinadh-Kumpatla and @valeraBr, please test the binary in PR #14871, available once CI has finished all tests. Let me know if this fixes the issue!

srebhan • Feb 21 '24 12:02

Can someone please test the PR!?!?

srebhan • Mar 05 '24 14:03

@srebhan I've tested the PR, and it works now with the provided binaries. No more cluster-scope errors. Thanks.

valeraBr • Mar 06 '24 15:03