help request: Why are cluster-wide permissions required for some resources such as DaemonSets, Secrets & more?
Describe the issue
I am looking to roll out Fluent Operator in an enterprise Kubernetes cluster to handle various telemetry needs for 100+ services. The configuration-related CRDs appear to allow our service owners to extend our telemetry pipeline for their own needs, such as additional outputs. That's the main reason Fluent Operator stood out to me.
Security is a concern because the operator is given full cluster-wide permissions on DaemonSets, StatefulSets, Secrets, ServiceAccounts, and more. Given that these resources are namespace-scoped, what functionality does the operator provide that requires cluster-wide permissions on them?
I've attempted to reduce the permission scope on these resources to a single namespace using Roles/RoleBindings, but the operator attempts to list/watch these resources across the entire cluster.
For example, I moved the permissions below from the ClusterRole to a namespaced Role (with the appropriate bindings). If the operator and either FluentBit or Fluentd all live in a single namespace, I would not expect the operator to need to monitor the entire cluster for these resources.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: telemetry
  name: fluent-operator
rules:
- apiGroups:
  - apps
  resources:
  - daemonsets
  - statefulsets
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
- apiGroups:
  - rbac.authorization.k8s.io
  resources:
  - clusterrolebindings
  verbs:
  - create
  - list
  - get
  - watch
  - patch
- apiGroups:
  - rbac.authorization.k8s.io
  resources:
  - clusterroles
  verbs:
  - create
  - list
  - get
  - watch
  - patch
- apiGroups:
  - ""
  resources:
  - secrets
  - configmaps
  - services
  verbs:
  - create
  - delete
  - get
  - list
  - patch
  - update
  - watch
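For readers trying to reproduce this, the "appropriate bindings" for the namespaced Role above could look roughly like the sketch below. It assumes the operator runs as the `fluent-operator` ServiceAccount in the `telemetry` namespace, which matches the user named in the operator logs. One caveat: ClusterRoles and ClusterRoleBindings are themselves cluster-scoped, so rules for them inside a namespaced Role have no effect. A setup like this would probably keep those two rules in a small ClusterRole; the name `fluent-operator-cluster-scoped` is hypothetical.

```yaml
# Binds the namespaced Role to the operator's ServiceAccount.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: telemetry
  name: fluent-operator
subjects:
- kind: ServiceAccount
  name: fluent-operator
  namespace: telemetry
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: fluent-operator
---
# Hypothetical ClusterRole holding only the rules for cluster-scoped resources.
# A namespaced Role cannot grant access to clusterroles/clusterrolebindings.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-operator-cluster-scoped
rules:
- apiGroups:
  - rbac.authorization.k8s.io
  resources:
  - clusterroles
  - clusterrolebindings
  verbs:
  - create
  - get
  - list
  - watch
  - patch
```

The ClusterRole would still need its own ClusterRoleBinding to the same ServiceAccount. This split alone does not change how the operator lists and watches resources.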
The logs below are from the operator once those permissions were scoped to a single namespace.
fluent-operator W0802 20:15:54.123352 1 reflector.go:424] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169: failed to list *v1.DaemonSet: daemonsets.apps is forbidden: User "system:serviceaccount:telemetry:fluent-operator" cannot list resource "daemonsets" in API group "apps" at the cluster scope
fluent-operator E0802 20:15:54.123404 1 reflector.go:140] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169: Failed to watch *v1.DaemonSet: failed to list *v1.DaemonSet: daemonsets.apps is forbidden: User "system:serviceaccount:telemetry:fluent-operator" cannot list resource "daemonsets" in API group "apps" at the cluster scope
fluent-operator W0802 20:16:07.306561 1 reflector.go:424] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169: failed to list *v1.Service: services is forbidden: User "system:serviceaccount:telemetry:fluent-operator" cannot list resource "services" in API group "" at the cluster scope
fluent-operator E0802 20:16:07.306618 1 reflector.go:140] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169: Failed to watch *v1.Service: failed to list *v1.Service: services is forbidden: User "system:serviceaccount:telemetry:fluent-operator" cannot list resource "services" in API group "" at the cluster scope
fluent-operator W0802 20:16:10.656780 1 reflector.go:424] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169: failed to list *v1.Secret: secrets is forbidden: User "system:serviceaccount:telemetry:fluent-operator" cannot list resource "secrets" in API group "" at the cluster scope
fluent-operator E0802 20:16:10.656875 1 reflector.go:140] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169: Failed to watch *v1.Secret: failed to list *v1.Secret: secrets is forbidden: User "system:serviceaccount:telemetry:fluent-operator" cannot list resource "secrets" in API group "" at the cluster scope
fluent-operator W0802 20:16:19.079865 1 reflector.go:424] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169: failed to list *v1.StatefulSet: statefulsets.apps is forbidden: User "system:serviceaccount:telemetry:fluent-operator" cannot list resource "statefulsets" in API group "apps" at the cluster scope
fluent-operator E0802 20:16:19.079921 1 reflector.go:140] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169: Failed to watch *v1.StatefulSet: failed to list *v1.StatefulSet: statefulsets.apps is forbidden: User "system:serviceaccount:telemetry:fluent-operator" cannot list resource "statefulsets" in API group "apps" at the cluster scope
fluent-operator W0802 20:16:23.153158 1 reflector.go:424] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169: failed to list *v1.ServiceAccount: serviceaccounts is forbidden: User "system:serviceaccount:telemetry:fluent-operator" cannot list resource "serviceaccounts" in API group "" at the cluster scope
fluent-operator E0802 20:16:23.153227 1 reflector.go:140] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169: Failed to watch *v1.ServiceAccount: failed to list *v1.ServiceAccount: serviceaccounts is forbidden: User "system:serviceaccount:telemetry:fluent-operator" cannot list resource "serviceaccounts" in API group "" at the cluster scope
Is it possible to configure RBAC for Fluent Operator such that it doesn't need cluster-wide permissions on these resources while continuing to allow telemetry to be collected from all namespaces? If not, what functionality does the operator provide that requires such permissions?
How did you install fluent operator?
helm upgrade --install fluent-operator --create-namespace -n telemetry <chart_path>
That's a good idea. We originally set these permissions at the cluster level because fluentbit essentially collects logs from all namespaces. Should we make it adjustable for custom namespaces?
Unless I'm misunderstanding something, the operator shouldn't need full cluster-scope access to daemonsets/statefulsets to collect logs from all namespaces. It would only need those permissions in the namespace that fluentbit/fluentd are deployed to. I can see that the operator does need cluster-scope access for some plugins that collect from all namespaces, so that it can grant those permissions to fluentbit/fluentd; the k8s plugin reading pod info is one example.
Are there any use cases I'm not considering that would require the operator to have cluster-scope permissions on daemonsets/statefulsets?
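To make the k8s plugin point concrete: Kubernetes RBAC does not let a subject create a ClusterRole granting permissions it does not already hold itself, unless it has the `escalate` verb. That is one reason the operator would need some cluster-scoped read permissions of its own. A minimal sketch of the kind of ClusterRole it might create for fluentbit/fluentd is shown here; the name and the exact resources and verbs are assumptions, not taken from the chart.

```yaml
# Hypothetical ClusterRole the operator could create for the collectors so
# the k8s plugin can look up pod metadata in every namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluent-bit-metadata-reader  # hypothetical name
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch
```

Nothing in a role like this involves daemonsets or statefulsets, which is consistent with the expectation that those could stay namespace-scoped.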
Is there a work-around or solution for this issue? I'm encountering the same issue.
@boatski may I know if there is any update or solution for narrowing this down to the minimum permissions that fluent-operator needs?
Besides Fluent Operator itself, there is also the rbacrules setting in FluentBit. The defaults might not be enough to collect the log files, so I am setting the rules to the same ones as the operator's. That works smoothly, but I am sure it is over-granted.
Could you please help review this issue? It has drawn the attention of our security team and is blocking our current onboarding procedure. @cw-Guo
Yeah, we should definitely review the namespace-related setup for fluent-operator.
We should support both a namespace-scoped install and a cluster-wide install.
But this change is not trivial at all.
Thanks for the reply @cw-Guo, I am afraid I will have to trim the clusterroles locally to the minimum required permissions if this is not on your priority list :(
Any movement on this request? We're seeing much higher memory usage by the fluent-operator in larger clusters with a significant number of secrets/configmaps. It has reached the point where we have to double, triple, or further raise the operator's memory limits because it keeps getting OOM-killed. Aside from better memory management/garbage collection to keep this in check, namespace-scoped RBAC for the operator would seem to help keep this overhead low by limiting the number of resources being cached in such scenarios.