actions-runner-controller icon indicating copy to clipboard operation
actions-runner-controller copied to clipboard

chart: Limiting RABC permissions to the watch namespace when specified

Open danyoungwu opened this issue 3 years ago • 5 comments

Controller Version

0.22.0

Helm Chart Version

No response

Deployment Method

Other

Checks

  • [X] This isn't a question or user support case (For Q&A and community support, go to Discussions. It might also be a good idea to contract with any of contributors and maintainers if your business is so critical and therefore you need priority support
  • [X] I've read releasenotes before submitting this issue and I'm sure it's not due to any recently-introduced backward-incompatible changes
  • [X] My actions-runner-controller version (v0.x.y) does support the feature
  • [X] I've already upgraded ARC to the latest and it didn't fix the issue

Resource Definitions

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    control-plane: controller-manager
  name: controller-manager
  namespace: actions-runner-system
spec:
  replicas: 1
  selector:
    matchLabels:
      control-plane: controller-manager
  template:
    metadata:
      labels:
        control-plane: controller-manager
    spec:
      serviceAccountName: actions-runner-system-admin
      containers:
      - args:
        - --metrics-addr=127.0.0.1:8080
        - --enable-leader-election
        - --sync-period=10m
        command:
        - /manager
        env:
        - name: GITHUB_TOKEN
          valueFrom:
            secretKeyRef:
              key: github_token
              name: controller-manager
              optional: true
        - name: GITHUB_APP_ID
          valueFrom:
            secretKeyRef:
              key: github_app_id
              name: controller-manager
              optional: true
        - name: GITHUB_APP_INSTALLATION_ID
          valueFrom:
            secretKeyRef:
              key: github_app_installation_id
              name: controller-manager
              optional: true
        - name: GITHUB_APP_PRIVATE_KEY
          value: /etc/actions-runner-controller/github_app_private_key
        image: devtools-docker-local.decc-af.eng.vmware.com/wud/summerwind/actions-runner-controller:v0.22.0
        name: manager
        ports:
        - containerPort: 9443
          name: webhook-server
          protocol: TCP
        resources:
          limits:
            cpu: 100m
            memory: 100Mi
          requests:
            cpu: 100m
            memory: 20Mi
        volumeMounts:
        - mountPath: /tmp/k8s-webhook-server/serving-certs
          name: cert
          readOnly: true
        - mountPath: /etc/actions-runner-controller
          name: controller-manager
          readOnly: true
      - args:
        - --secure-listen-address=0.0.0.0:8443
        - --upstream=http://127.0.0.1:8080/
        - --logtostderr=true
        - --v=10
        - --watch-namespace actions-runner-system
        image: harbor-repo.vmware.com/dockerhub-proxy-cache/bitnami/kube-rbac-proxy:0.11.0
        name: kube-rbac-proxy
        ports:
        - containerPort: 8443
          name: https
      terminationGracePeriodSeconds: 10
      volumes:
      - name: cert
        secret:
          defaultMode: 420
          secretName: webhook-server-cert
      - name: controller-manager
        secret:
          secretName: controller-manager

To Reproduce

Refer https://github.com/actions-runner-controller/actions-runner-controller/releases/download/v0.22.0/actions-runner-controller.yaml
1. deploy all the CustomResourceDefinition 
2. deploy namespace actions-runner-system
3. deploy Role, ClusterRole, RoleBinding in v0.22.0/actions-runner-controller.yaml, change the ClusterRoleBinding to RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: manager-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: manager-role
subjects:
- kind: ServiceAccount
  name: default
  namespace: actions-runner-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: proxy-rolebinding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: proxy-role
subjects:
- kind: ServiceAccount
  name: default
  namespace: actions-runner-system
4. deploy Deployment controller-manager with '--watch-namespace actions-runner-system'

Describe the bug

When --watch-namespace is set, controller still wants permissions on the cluster side. Could controller performs on the namespace level with '-watch-namespace'?

Describe the expected behavior

controller performs on the namespace level with '-watch-namespace', seek for permission in namespaced instead of cluster scope

Controller Logs

E0513 19:38:31.034204       1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1alpha1.RunnerReplicaSet: failed to list *v1alpha1.RunnerReplicaSet: runnerreplicasets.actions.summerwind.dev is forbidden: User "system:serviceaccount:actions-runner-system:actions-runner-system-admin" cannot list resource "runnerreplicasets" in API group "actions.summerwind.dev" at the cluster scope
W0513 19:39:00.209867       1 reflector.go:324] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: failed to list *v1.StatefulSet: statefulsets.apps is forbidden: User "system:serviceaccount:actions-runner-system:actions-runner-system-admin" cannot list resource "statefulsets" in API group "apps" at the cluster scope
E0513 19:39:00.209913       1 reflector.go:138] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: Failed to watch *v1.StatefulSet: failed to list *v1.StatefulSet: statefulsets.apps is forbidden: User "system:serviceaccount:actions-runner-system:actions-runner-system-admin" cannot list resource "statefulsets" in API group "apps" at the cluster scope
W0513 19:39:21.288471       1 reflector.go:324] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:167: failed to list *v1alpha1.RunnerReplicaSet: runnerreplicasets.actions.summerwind.dev is forbidden: User "system:serviceaccount:actions-runner-system:actions-runner-system-admin" cannot list resource "runnerreplicasets" in API group "actions.summerwind.dev" at the cluster scope

Runner Pod Logs

None

Additional Context

No response

danyoungwu avatar May 17 '22 18:05 danyoungwu

@danyoungwu Good catch!

This isn't a bug, as the feature to differentiate RBAC resources' scope according to the watch namespace has never been implemented. But it definitely sounds like a valid feature request!

If you're familiar with how RBAC works and helm charts, it would be great if you could consider submitting a pull request to add it!

AFAIK, it should mostly be a matter of whether to use RoleBinding or ClusterRoleBinding to bind ARC's ClusterRole to ARC's ServiceAccount.

In case it was bound via ClusterRoleBinding as we do today (https://github.com/actions-runner-controller/actions-runner-controller/blob/master/charts/actions-runner-controller/templates/manager_role_binding.yaml and https://github.com/actions-runner-controller/actions-runner-controller/blob/master/charts/actions-runner-controller/templates/githubwebhook.role_binding.yaml) it should gain permissions across all namespaces(=cluster-wide as you said).

So, we should selectively use RoleBinding created within the watch namespace, so that ARC gains permissions only within the watch namespace.

Similarly, we'd need to selectly use RoleBinding for kube-rbac-proxy, too: https://github.com/actions-runner-controller/actions-runner-controller/blob/master/charts/actions-runner-controller/templates/auth_proxy_role_binding.yaml

In other words, we won't need to change https://github.com/actions-runner-controller/actions-runner-controller/blob/master/charts/actions-runner-controller/templates/manager_role.yaml or create a Role to achieve our goal here. Just my two cents.

mumoshu avatar May 18 '22 00:05 mumoshu

@mumoshu While dynamically switching from the ClusterRoleBinding to a RoleBinding is important for ease of use, that can be done by the users manually if needed. What isn't easily fixed by the users is the fact that actions-runner-controller is trying to query objects on the cluster level instead of the namespace level when --watch-namespace is specified. Someone needs to change the code of the controller to query for runnerreplicasets and statefulsets on the namespace level when --watch-namespace is specified.

alex-vmw avatar May 18 '22 05:05 alex-vmw

@alex-vmw Ah, thanks for clarifying! That makes sense. I thought there were a few places we call the List func without limiting the namespace so it might need to be updated.

mumoshu avatar May 18 '22 05:05 mumoshu

@mumoshu Do you know if someone can produce a fix for this (on the controller side, not ClusterRoleBinding/RoleBinding side) as we don't have Go expertise? We have a number of urgent use-cases where users only have access to namespaced k8s objects, but want to run actions-runner-controller. Thank you in advance.

alex-vmw avatar May 20 '22 19:05 alex-vmw

@alex-vmw Hey. Unfortunately we won't be implementing this soon and there's no estimated EDD, because we are currently working on releasing 0.24.0 and we already have all the tasks occupied up until the release after the next: https://github.com/actions-runner-controller/actions-runner-controller/milestones

Also- is it urgent enough that you can consider sponsoring our work? It will help me afford more time to work on this project and therefore you'll likely be able to receive updates and releases earlier.

mumoshu avatar May 20 '22 22:05 mumoshu

@mumoshu Do you know if someone can produce a fix for this (on the controller side, not ClusterRoleBinding/RoleBinding side) as we don't have Go expertise? We have a number of urgent use-cases where users only have access to namespaced k8s objects, but want to run actions-runner-controller. Thank you in advance.

I couldn't agree more. This is very common to have a restricted environment where the ClusterRole is not easily granted. Even the CRDs themselves are are not so easily accepted, but that can be exceptionally pushed through. However, granting the cluster-wide permissions are highly suspicious in the corporate world. It would be great if there was a possibility to reassess this point. I'm sure there are plenty of companies hindered by this out there. It would be even better if there was an alternative to CRDs, so you could solely rely on resources constrained to a given namespace. However, I'm not expecting to rewrite the whole thing just for that ;-)

@mumoshu : first of all - what a fantastic job you guys are doing with ARC! That's truly commendable. What level of support you would accept for implementing such request? Are you also interested in external contributions? Myself I have some experience in GO, but I wouldn't say I'm a pro here. However, there is a chance I could tempt some of my mates from the company to chip in and help out with this project.

wherka-ama avatar Jan 24 '23 15:01 wherka-ama

Fixed by https://github.com/actions/actions-runner-controller/pull/2374?

uhthomas avatar Mar 09 '23 15:03 uhthomas

Fixed by #2374?

I would say - most likely. Just one assumption: a current version of the chat is deprecated. The new one looks most definitely better to me.

We can wait for @TingluoHuang or @mumoshu to confirm.

wherka-ama avatar Mar 09 '23 15:03 wherka-ama