rancher icon indicating copy to clipboard operation
rancher copied to clipboard

[BUG] Rancher RoleTemplate CRD does not validate roles properly

Open samjustus opened this issue 2 years ago • 14 comments

Rancher Server Setup

  • Rancher version: 2.6.9

Describe the bug Rancher RoleTemplate CRD does not validate roles properly.

To Reproduce Create a new RoleTemplate CR such as:

apiVersion: management.cattle.io/v3 builtin: false context: project description: "Test" displayName: Test Project Role external: false hidden: false kind: RoleTemplate metadata: name: test-project-role projectCreatorDefault: false rules:

apiGroups: [] resources: namespaces verbs: create

Result Kubernetes accepts the CR and then Rancher begins failing to apply that Project (or Cluster) Role:

2022/11/30 22:46:14 [ERROR] error syncing 'ztiap-sts': handler namespace-auth: couldn't ensure roles: couldn't update clusterRole test-project-role: couldn't update clusterRole test-project-role: ClusterRole.rbac.authorization.k8s.io "test-project-role" is invalid: rules[0].apiGroups: Required value: resource rules must supply at least one api group, requeuing

I suspect that the Rancher UI is validating that the RoleTemplate is created properly. That validation should be moved into the Kubernetes API rather than the client.

Expected Result Kubernetes rejects the CR creation because the list of rules contains an invalid rule (the apiGroups is an empty list)

SURE-5702

samjustus avatar Feb 16 '23 21:02 samjustus

Quoting a piece from SURE-5702 which did not make it here

I suspect that the Rancher UI is validating that the RoleTemplate is created properly. That validation should be moved into the Kubernetes API rather than the client.

The implication that the CR was created via kubectl, and not the RUI, bypassing the suspected validation done by the UI.

Validation in the kube API :arrow_right: webhook. Grepping filenames for roletemplate delivers

./pkg/resources/management.cattle.io/v3/roletemplate/validator.go

as a possible starting point. Which contains validateCreateFields called from the validators Admit. Nothing in that code seems to look at the rules.

:warning: The validateUpdateFields does not seems to prevent editing of the rules of an RT, i.e. to empty apiGroups, or verbs. Although comments somewhere else seem to indicate that this would happen.

andreas-kupries avatar Jan 26 '24 09:01 andreas-kupries

Reading more the verb check is in Admit itself, and not the validate... functions.

andreas-kupries avatar Jan 26 '24 09:01 andreas-kupries

The verb check is done via CheckForVerbs, in the common support for validators. It seems to me that an analogous CheckForAPIGroups should do what is wanted.

andreas-kupries avatar Jan 26 '24 09:01 andreas-kupries

A bit more from SURE about reproduction

One note that wasn't on the issue - you need to bind a user to the impacted role template in a project or you won't see an error in the logs (I believe we don't attempt to create backing roles until someone actually has need of those rules).

And about possible solutions

As far as the solution goes, we might be able to get k8s to do the heavy lifting for us if we can get our CRDs to be well-defined, which would avoid custom webhook logic.

What I have done so far was going for the webhook and custom logic.

Researching into the CRD spec direction the CRD looks to be defined via

  • ./pkg/crds/yaml/generated/management.cattle.io_roletemplates.yaml

and code under pkg/crds/... installs these yaml files into the cluster on setup.

The files are apparently embedded into the rancher binary. The exact method is actually not relevant to repro and fix. The files are apparently also generated from somewhere. This will be relevant to the fix, as the sources would have to be modified. The location of these sources is currently unknown :warning: :question:

The spec is based on openAPIV3Schema.

  • Starting at (googled: kubernetes crd spec) https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions
  • It references (at the top) https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.29/#customresourcedefinition-v1-apiextensions-k8s-io
  • Following field type references yields https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.29/#jsonschemaprops-v1-apiextensions-k8s-io
  • The minLength field listed there may be what is needed to constrain the apiGroups property to length 1 and longer.

The existing apiGroups spec is

apiGroups:
  description: APIGroups is the name of the APIGroup that contains
    the resources.  If multiple API groups are specified, any action
    requested against one of the enumerated resources in any API group
    will be allowed. "" represents the core API group and "*" represents
    all API groups.
  items:
    type: string
  type: array

It would have to be changed to

apiGroups:
  __minLength: 1__
  description: APIGroups is the name of the APIGroup that contains
    the resources.  If multiple API groups are specified, any action
    requested against one of the enumerated resources in any API group
    will be allowed. "" represents the core API group and "*" represents
    all API groups.
  items:
    type: string
  type: array

Notes:

  • I wonder if the existing checkForVerbs code could be replaced by an analogous constraint on the verbs array .
  • Same for the resources array.

andreas-kupries avatar Jan 26 '24 10:01 andreas-kupries

Ok. Repro succesful:

2024/01/26 13:17:14 [ERROR] error syncing 'p-p2nxh/prtb-cz899': handler cluster-prtb-sync: couldn't ensure roles: couldn't create clusterRole test-gh-40584-role: ClusterRole.rbac.authorization.k8s.io "test-gh-40584-role" is invalid: rules[0].apiGroups: Required value: resource rules must supply at least one api group, requeuing

After figuring out how to bind a user to the impacted role template:

UI:

  1. Create a test user.
  2. Go to local Cluster - Projects/Namespaces - Create project. As part of that
  3. Add the user, and select the bogus role template as the users permissions.
  4. Done

andreas-kupries avatar Jan 26 '24 12:01 andreas-kupries

Fix with custom code (CheckForAPIGroups analogous to CheckForVerbs) is successful:

work@tagetarl:~/SUSE/dev/Rancher> k apply -f 40584-roletemplate.yaml Error from server (BadRequest): error when creating "40584-roletemplate.yaml": admission webhook "rancher.cattle.io.roletemplates.management.cattle.io" denied the request: policyRules must have at least one APIgroup: &PolicyRule{Verbs:[create],APIGroups:[],Resources:[namespaces],ResourceNames:[],NonResourceURLs:[],}

andreas-kupries avatar Jan 26 '24 12:01 andreas-kupries

Fix with a change to the CRD is successful:

work@tagetarl:~/SUSE/dev/Rancher> k apply -f 40584-roletemplate.yaml kubectl apply -f 40584-roletemplate.yaml The RoleTemplate "test-gh-40584-role" is invalid: rules[0].apiGroups: Invalid value: 0: rules[0].apiGroups in body should have at least 1 items

Note, the minLength mentioned above is the wrong constraint. The actual constraint is minItems.

andreas-kupries avatar Jan 26 '24 13:01 andreas-kupries

With both possible fixes working I have questions:

  1. For one, which of the two is prefered ? a. Code change, or b. CRD change
  2. If the second, then there are open questions about the setup for the rancher CRDs: a. The CRD resides in pkg/crds/yaml/generated/management.cattle.io_roletemplates.yaml. b. As this file is apparently generated, where are the sources it was generated from ? c. And how would it be generated ? d. Note, depending on what the CRD yaml is generated from it may not be possible to set the desired constraint there. This would force us back to the code change in the webhook.

andreas-kupries avatar Jan 26 '24 13:01 andreas-kupries

From comms in slack thread, by @MbolotSuse

  • The source for the role template CRD can be found here: https://github.com/rancher/rancher/blob/abc87f5240fd6799b3f485106638f967373a736c/pkg/apis/management.cattle.io/v3/authz_types.go#L216
  • After editing that file you should use go generate ./... to regenerate the CRDs (LMK if that doesn't work - you might also be able to use make build-crds for this purpose).
  • I would recommend putting this check in the webhook.
    • The API Groups check is not the only one performed by k8s.
    • We want to make sure that we mirror all of the checks that k8s uses so that we don't see this issue surface in other ways (such as by providing no verbs, or no resources).
    • The webhook is also a bit easier to test than CRD level validation.
  • @Jonathan Crowther recently made a change for GlobalRoles which made sure those rules were valid according to upstream: https://github.com/rancher/webhook/blob/6d6130f5fe1f70db905a851225d4929855e6dcf2/pkg/resources/management.cattle.io/v3/globalrole/validator.go#L262. I would recommend implementing similar validation for your type.

Me:

Wrt resources it seems that the hook does not check that either. verbs was the only thing check AFAICT.

@MbolotSuse

Yep, I would expect the webhook check to miss a few validations on the rules right now. That's why I linked that code to the GlobalRole stuff - that function uses an upstream library which should match what k8s is looking for 1-1, so that should let you fix the webhook to get a lot without needing to get too deep into the finer details. (edited)

Me:

Ah :+1: Looking at the authz_types.go I see that the type for the rules property comes straight out of the kube rbac types, and these do not contain any constraints wrt minItems, so that explains why whatever tool called by go generate is not adding such to its output.

andreas-kupries avatar Jan 26 '24 14:01 andreas-kupries

The global role validator changes came from

  • https://github.com/rancher/webhook/commit/b18c38740a52623e44539f7b5ed6eb02c2f53306
  • I.e. https://github.com/rancher/webhook/pull/309

andreas-kupries avatar Jan 26 '24 14:01 andreas-kupries

Back to In Progress. While the PR got the requisite approvals a comment pointed to integration tests not seen, which should be extended as well.

andreas-kupries avatar Jan 31 '24 15:01 andreas-kupries

Reported an issue in v2.8 related to requeing error messages with missing annotations on a RoleTemplate.

[BUG] : Requeing error messages in Logs when user create RoleTemplates without annotations https://github.com/rancher/rancher/issues/45365

dasarinaidu avatar May 03 '24 14:05 dasarinaidu

Observation 4: [BUG] : Wrong Inherit RoleTemplates throwing Internal Error exception from webhook https://github.com/rancher/rancher/issues/45371

Note: This is an existing issue

dasarinaidu avatar May 03 '24 18:05 dasarinaidu

Observation 5 : [BUG] : Inherit RoleTemplates throwing Error exception from webhook without roletemplate name https://github.com/rancher/rancher/issues/45372

dasarinaidu avatar May 03 '24 19:05 dasarinaidu

Completed validations on v2.9-head, no new issues reported and closing this issue. v2.9-3c4ccdc5bc9fde3510089153b5ad58fdbe604880-head

dasarinaidu avatar Jul 16 '24 00:07 dasarinaidu