vcluster
CVE-2023-1260 attempt to create RBAC not currently held in openshift cluster: pods/ephemeralcontainers pods/status
What happened?
Deploying vcluster on an OpenShift cluster that has https://bugzilla.redhat.com/show_bug.cgi?id=2176267 fixed (meaning some RBAC permissions are restricted) leads to:
warning: Upgrade "my-vcluster" failed: failed to create resource: roles.rbac.authorization.k8s.io "vc-my-vcluster" is forbidden: user "antoinetran" (groups=["2004833" "system:authenticated:oauth" "system:authenticated"]) is attempting to grant RBAC permissions not currently held:
{APIGroups:[""], Resources:["pods/ephemeralcontainers"], Verbs:["patch" "update"]}
{APIGroups:[""], Resources:["pods/status"], Verbs:["patch" "update"]}
What did you expect to happen?
vcluster deploys successfully.
How can we reproduce it (as minimally and precisely as possible)?
- Deploy a Kubernetes cluster
- Create a non-admin serviceAccount user that does not hold the following RBAC permissions:
{APIGroups:[""], Resources:["pods/ephemeralcontainers"], Verbs:["patch" "update"]}
{APIGroups:[""], Resources:["pods/status"], Verbs:["patch" "update"]}
- Deploy vcluster using that serviceAccount
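To confirm that the account used in the steps above really lacks the two permissions, one option is to submit a SelfSubjectAccessReview while authenticated as that account. This is only a sketch: the namespace `my-namespace` and the file name `ssar.yaml` are placeholders, and the manifest has to be repeated for each combination of subresource (`status`, `ephemeralcontainers`) and verb (`patch`, `update`).

```yaml
# Hypothetical check, created with: kubectl create -f ssar.yaml -o yaml
# The API server fills in status.allowed (true or false) in the response.
apiVersion: authorization.k8s.io/v1
kind: SelfSubjectAccessReview
spec:
  resourceAttributes:
    namespace: my-namespace   # placeholder: namespace where vcluster is deployed
    group: ""                 # core API group
    resource: pods
    subresource: status       # or: ephemeralcontainers
    verb: update              # or: patch
```

If `status.allowed` comes back `false`, the RBAC escalation check is expected to reject the Role that the chart tries to create, as in the error reported above.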
Anything else we need to know?
No response
Host cluster Kubernetes version
$ kubectl version
Client Version: v1.28.15
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.2-3580+6216ea1e51a212-dirty
vcluster version
$ vcluster --version
vcluster version 0.23.0
VCluster Config
# My vcluster.yaml / values.yaml here
Suggested solution: I can open a pull request that removes the following rule (https://github.com/loft-sh/vcluster/blob/v0.23.0/chart/templates/role.yaml#L31) from the default role:
  - apiGroups: [""]
    resources: ["pods/status", "pods/ephemeralcontainers"]
    verbs: ["patch", "update"]
Workaround: values.yaml
rbac:
  # Role holds virtual cluster role configuration
  role:
    # Enabled defines if the role should be enabled or disabled.
    enabled: true
    # Error:
    # * roles.rbac.authorization.k8s.io "vc-my-vcluster" is forbidden: user "antoinetran" (groups=["2004833" "system:authenticated:oauth" "system:authenticated"]) is attempting to grant RBAC permissions not currently held:
    #   {APIGroups:[""], Resources:["pods/ephemeralcontainers"], Verbs:["patch" "update"]}
    #   {APIGroups:[""], Resources:["pods/status"], Verbs:["patch" "update"]}
    # * roles.rbac.authorization.k8s.io "vc-my-vcluster" not found
    overwriteRules:
      - apiGroups: [""]
        resources: ["configmaps", "secrets", "services", "pods", "pods/attach", "pods/portforward", "pods/exec", "persistentvolumeclaims"]
        verbs: ["create", "delete", "patch", "update", "get", "list", "watch"]
      # See https://bugzilla.redhat.com/show_bug.cgi?id=2176267
      #- apiGroups: [""]
      #  resources: ["pods/status", "pods/ephemeralcontainers"]
      #  verbs: ["patch", "update"]
      - apiGroups: ["apps"]
        resources: ["statefulsets", "replicasets", "deployments"]
        verbs: ["get", "list", "watch"]
      - apiGroups: [""]
        resources: ["endpoints", "events", "pods/log"]
        verbs: ["get", "list", "watch"]
      - apiGroups: ["networking.k8s.io"]
        resources: ["ingresses"]
        verbs: ["create", "delete", "patch", "update", "get", "list", "watch"]
After removing the patch/update RBAC rule as written above, I can now deploy the vcluster with my Helm components. However, I get errors like this in the events:
ingress-nginx 9m18s Warning SyncError pod/my-nginx-jupyter-75b859dcdc-lb6bf Error syncing: patch host object: update object status: pods "my-nginx-jupyter-75b859dcdc-lb6bf-x-ingress-nginx-x-my-vcluster" is forbidden: User "system:serviceaccount:REDACTED:vc-my-vcluster" cannot update resource "pods/status" in API group "" in the namespace "REDACTED"
It seems that vcluster relies heavily on patching during its sync. In the previous version, v0.20.0-beta1, this did not seem to be the case. Do I have to ask my cluster admin to lower the security by granting me these RBAC permissions? Or can we make vcluster work without patch/update?
I am now asking in https://bugzilla.redhat.com/show_bug.cgi?id=2176267 whether these privileges can be given to users in the fixed versions of OpenShift (>=4.11).
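If the answer turns out to be asking the cluster admin, the grant might look roughly like the following. This is a sketch rather than a tested OpenShift configuration: the Role and RoleBinding name and the namespace are hypothetical, and the user name is the one from the error message in the first post. A cluster admin would have to apply it, because the RBAC escalation check only lets a user grant permissions that the user already holds.

```yaml
# Hypothetical Role holding only the two rules that the vcluster chart
# could not grant. Names and namespace are placeholders.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: vcluster-pod-subresources
  namespace: my-namespace
rules:
  - apiGroups: [""]
    resources: ["pods/status", "pods/ephemeralcontainers"]
    verbs: ["patch", "update"]
---
# Binds the Role above to the user who installs the vcluster chart,
# so that this user holds the permissions and may pass them on.
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: vcluster-pod-subresources
  namespace: my-namespace
subjects:
  - kind: User
    name: antoinetran
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: vcluster-pod-subresources
  apiGroup: rbac.authorization.k8s.io
```

With such a binding in place, the default chart role should no longer trip the "permissions not currently held" check, at the cost of handing back permissions that the OpenShift fix removed on purpose.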
The SyncError above is triggered by https://github.com/loft-sh/vcluster/blob/main/pkg/patcher/apply.go#L253. I read the code, but I don't understand the reason behind this behavior: when a pod is created in the vcluster, it is also created in the host cluster, first without status, and then the status alone is applied through a patch. In fact, I don't see why vcluster needs to patch the status in the vcluster-to-host direction; only the host-to-vcluster direction makes sense to me. Currently, my deployment in a cluster without the patch/update privileges triggers a few errors each time I create a pod, but this seems to have no consequence because everything runs fine, which is why I suspect this status patch is not needed.
Unfortunately, vCluster relies on those permissions, so they cannot be removed by default:
  - apiGroups: [""]
    resources: ["pods/status", "pods/ephemeralcontainers"]
    verbs: ["patch", "update"]
Hi @deniseschannon, but what are they needed for? I am running in an OpenShift cluster, which doesn't provide them by default, and I don't see anything going wrong without them. It would be great for OpenShift compatibility to either confirm that we really need these RBAC rules, or remove them if they are not necessary, so that it works by default :)
Let me add that vcluster is of great value, especially in OpenShift environments, where a project is given only one namespace and limited rights. In fact, this is the only use case I am using it for!
You can work around this limitation by using a tool that will provision virtual clusters on behalf of a user, e.g. vCluster Platform or a CAPI provider.
I don't understand your answer. I already provided a workaround (see the first post above). The link you provided is about vcluster cloud.
I was trying to say that if you want to create a vCluster as a user who doesn't have all the necessary RBAC permissions, you can use one of the referenced tools.
Feel free to suggest a change that would incorporate your workaround, either a PR, or start with proposing a new config option for vcluster.yaml (chart values).
After a week or two of using vcluster without these RBAC rules in an OpenShift cluster, I can say it still works fine! So I still believe these RBAC rules can be deleted.
I can work on removing the RBAC rules from the Helm chart, but removing the patch calls from the vcluster code is a whole other level of work!
So after investigating:
- the patch/update of "pods/status" is used at https://github.com/loft-sh/vcluster/blob/v0.28.0-next.11/pkg/patcher/apply.go#L253, but I don't know why vcluster does that.
- the patch/update of "pods/ephemeralcontainers" is used at https://github.com/loft-sh/vcluster/blob/v0.28.0-next.11/pkg/controllers/resources/pods/ephemeral_containers.go#L67 . The RBAC is needed when one adds an ephemeral container after a pod is created. vcluster needs to update the pod at host level.
Anyway, we can remove these RBAC rules manually in OpenShift clusters with the workaround above. But I am not sure that removing them in the code is the right solution. I ran without these RBAC rules for a long time and did not see any regression, but I don't use ephemeral containers, so maybe that's why.
I think the best course of action is for upstream OpenShift to allow these RBAC permissions instead of removing them as part of the CVE fix. I hope this ticket gets addressed: https://bugzilla.redhat.com/show_bug.cgi?id=2349782
Another option is to make these RBAC rules optional in the chart. I added a configuration option for this in the following pull request: https://github.com/loft-sh/vcluster/pull/3084