
Support more sophisticated condition checking on mutation

Open ethernoy opened this issue 3 years ago • 12 comments

Describe the solution you'd like: Support more sophisticated condition checking on mutation beyond the implemented pathTest. Perhaps the existing Rego engine could be used for such checks.

Anything else you would like to add: N/A

Environment: N/A

ethernoy avatar Sep 28 '21 06:09 ethernoy

Do you have a use case that is blocked by not having more sophisticated condition checking?

maxsmythe avatar Sep 28 '21 23:09 maxsmythe

Do you have a use case that is blocked by not having more sophisticated condition checking?

Yes. For instance, I wish to use Gatekeeper to enforce a rule like the following:

A Deployment's memory limit should be at most 5Gi. If a user creates a Deployment with a memory limit > 5Gi, it will be overwritten with 5Gi.

ethernoy avatar Sep 29 '21 00:09 ethernoy

Thank you!

That seems like an interesting use case, though the Kubernetes docs do [warn against it](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#use-caution-when-authoring-and-installing-mutating-webhooks):

  • Users may be confused when the objects they try to create are different from what they get back.
  • Built in control loops may break when the objects they try to create are different when read back.
    • Setting originally unset fields is less likely to cause problems than overwriting fields set in the original request. Avoid doing the latter.
  • Future changes to control loops for built-in resources or third-party resources may break webhooks that work well today. Even when the webhook installation API is finalized, not all possible webhook behaviors will be guaranteed to be supported indefinitely.

One particularly nasty issue could occur with this use case: suppose you modified the memory usage of Pods instead of Deployments (which seems reasonable, since that way StatefulSets etc. are also covered). The Deployment controller may then continually kill the Pod because the value of the memory request doesn't match the Deployment's config.

Another reason to be careful of such tests is that it makes it possible to inadvertently create cycles in which mutation never ends. We had a feature similar to this that we've decided to remove before beta, per #1544

A discussion that links to most of the relevant background discussions for the reasons why cycles are undesirable and the sorts of use-cases we're looking to unblock can be found here.

TL;DR please keep noting use cases like this as you find them. We should be mindful about how we support them given the implications, but as we learn more about how mutation behaves and is used, there may be a way forward.

In the meantime, validation would at least enforce that users themselves submit a correct value for their memory limit.
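As a rough illustration of that validation-based alternative, here is a minimal sketch of a ConstraintTemplate that rejects (rather than mutates) over-limit Deployments. The template name `k8smaxmemorylimit`, the `maxMemory` parameter, and the unit-handling shortcut in the Rego are all illustrative assumptions, not an official policy; a real policy would need to normalize memory units properly:

```yaml
# Hedged sketch: reject Deployments whose container memory limit
# exceeds a parameterized maximum, instead of mutating them.
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8smaxmemorylimit
spec:
  crd:
    spec:
      names:
        kind: K8sMaxMemoryLimit
      validation:
        openAPIV3Schema:
          type: object
          properties:
            maxMemory:
              type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8smaxmemorylimit

        violation[{"msg": msg}] {
          container := input.review.object.spec.template.spec.containers[_]
          limit := container.resources.limits.memory
          # NOTE: assumes plain "<N>Gi" values for brevity; real policies
          # should normalize all memory units (Mi, Gi, plain bytes, ...).
          to_number(trim_suffix(limit, "Gi")) > to_number(trim_suffix(input.parameters.maxMemory, "Gi"))
          msg := sprintf("memory limit %v exceeds maximum %v", [limit, input.parameters.maxMemory])
        }
```

This keeps the user's submitted object authoritative: they see a rejection with a clear message rather than silently receiving a different object back, which sidesteps the mutation caveats quoted above.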

maxsmythe avatar Sep 29 '21 04:09 maxsmythe

A use case that we have is to label nodes with node-role.kubernetes.io/${role}=true based on a role=${role} label, because nodes cannot give themselves the node-role.kubernetes.io label.

For example:

» kubectl describe node example.local
Labels:             kubernetes.io/arch=amd64
                    kubernetes.io/os=linux
                    role=compute # get value of `role` label
                    node-role.kubernetes.io/compute=true # add this label

If a node has the role=compute label, then we want to add the label node-role.kubernetes.io/compute=true. Currently we're using a Node Feature Discovery DaemonSet to achieve this. Being able to use Gatekeeper mutations instead would mean we could drop the Node Feature Discovery DaemonSet.

rblaine95 avatar Oct 14 '21 08:10 rblaine95

This isn't about Gatekeeper, but @rblaine95, it looks like you're trying to skirt KEP 279. Are you feeding these "role" labels to the kubelet's configuration? If so, what leads you to trust this "role" label, when this feature is forcing you to not trust labels set that way?

seh avatar Oct 14 '21 12:10 seh

We use AWS AutoScalingGroups and a Cluster Autoscaler to dynamically horizontally scale worker nodes based on pod specifications (such as resource requests, node selectors, affinities, tolerations)

We use different groups for different workloads, we pass some labels (like role=compute, role=spot, etc) as Kubelet arguments.

We use these labels to help schedule workloads. For example: only applications that can handle interruptions and are highly available, or anything that isn't in a production namespace, are scheduled on role=spot. Our analytics stack (namely Spark workers) gets a dedicated spot instance for each worker, which reserves all the available memory and CPU; we use labels and tags to isolate billing on our AWS infrastructure.

The primary reason we want the node-role label is so that when we do kubectl get nodes we can, at a glance, see all the node roles in the returned output.

I understand that our use case probably isn't best practice and is rather niche. However, it does highlight a use case for adding labels to resources based on the value of other labels.


rblaine95 avatar Oct 14 '21 13:10 rblaine95

I understand the desire; we do the same. It's just that you can't trust labels like that as workload attractors (for use in pod affinity terms) if they're set by the kubelet. In many cases, that level of trust doesn't matter, such that even if an attacker gained access to the node and changed the labels fed to the kubelet, it wouldn't matter much to you.

seh avatar Oct 14 '21 14:10 seh

@rblaine95 Could you do this with a labelselector? Or are there enough node roles that this doesn't scale?

for example:

apiVersion: mutations.gatekeeper.sh/v1alpha1
kind: AssignMetadata
metadata:
  name: demo-annotation-owner
spec:
  match:
    labelSelector:
      matchLabels:
        role: "compute"
    kinds:
    - apiGroups: [""]
      kinds: ["Node"]
  location: metadata.labels."node-role.kubernetes.io/compute"
  parameters:
    assign:
      value:  "true"

maxsmythe avatar Oct 15 '21 03:10 maxsmythe

@maxsmythe, why didn't I think about using label selectors? 🤦‍♂️

I tested the following:

apiVersion: mutations.gatekeeper.sh/v1alpha1
kind: AssignMetadata
metadata:
  name: node-role-compute
spec:
  match:
    scope: Cluster
    labelSelector:
      matchLabels:
        role: "compute"
    kinds:
    - apiGroups: [""]
      kinds: ["Node"]
  location: metadata.labels."node-role.kubernetes.io/compute"
  parameters:
    assign:
      value:  "true"

And added a new node, but after ±10 minutes the node wasn't showing up when running kubectl get nodes. After changing location to metadata.labels."foo.bar/baz", new nodes join the cluster almost immediately. I bumped the Gatekeeper logs to DEBUG but can't seem to find anything resembling an error.

However, I did notice that if I remove the mutation, the node joins the cluster and I can see it in the output of kubectl get nodes with a status of NotReady. If I then recreate the mutation, the label is correctly added as soon as the Node resource gets updated (NotReady -> Ready). I think something is blocking the Node resource from having that label when it gets created.

Is there a way to make the mutation only execute on update?

Edit: In other words, is it possible to make some mutations only run on update instead of update and create?

rblaine95 avatar Oct 15 '21 09:10 rblaine95

I think something is blocking the Node resource from having that label when it gets created.

That's likely the "NodeRestriction" admission plugin enforcing KEP 279, as your desired label's qualified name prefix has the "kubernetes.io" suffix.

seh avatar Oct 15 '21 13:10 seh

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Jul 23 '22 03:07 stale[bot]

Another use case for more sophisticated conditions on mutation: I want to inject spec.ttlSecondsAfterFinished into Jobs when the Job is not owned by a CronJob (i.e. let that controller handle the lifecycle). To do that I need to specify a condition based on the ownerReferences in the object's metadata. Right now I think I can only do a path check on the ttlSecondsAfterFinished field to see whether or not it exists. Technically, I can use a labelSelector in the match criteria for !controller-uid (i.e. target Jobs where the label does not exist), since that label is appended when a Job is created by a CronJob. This would work fine as long as users don't set the controller-uid label on Jobs themselves.

https://kubernetes.io/docs/concepts/workloads/controllers/ttlafterfinished/ https://open-policy-agent.github.io/gatekeeper/website/docs/mutation#conditionals
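The labelSelector workaround described above could look roughly like this. This is a hedged sketch only: the name `job-ttl-default` is made up, and it relies on the assumption (noted above) that only CronJob-created Jobs carry the controller-uid label:

```yaml
# Hedged sketch: default spec.ttlSecondsAfterFinished on Jobs that
# lack the controller-uid label CronJob-created Jobs carry.
apiVersion: mutations.gatekeeper.sh/v1alpha1
kind: Assign
metadata:
  name: job-ttl-default
spec:
  applyTo:
    - groups: ["batch"]
      versions: ["v1"]
      kinds: ["Job"]
  match:
    kinds:
      - apiGroups: ["batch"]
        kinds: ["Job"]
    labelSelector:
      matchExpressions:
        - key: controller-uid
          operator: DoesNotExist
  location: "spec.ttlSecondsAfterFinished"
  parameters:
    assign:
      value: 3600
    pathTests:
      - subPath: "spec.ttlSecondsAfterFinished"
        condition: MustNotExist   # only set when the user left it unset
```

Note that the pathTest only covers the "does not exist" half of the desired condition; the "exists but is greater than 3600" half is exactly what current conditionals cannot express.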

But there is no guarantee that other objects will have labels to key off, so I think the ability to have more sophisticated conditions based on an object's metadata or spec is probably still warranted. If there are concerns about recursive mutations, rather than preventing such conditions from being created, perhaps there could be some kind of internal logic to detect these conflicts ahead of time or as they happen.

It would also be nice to have conditions based on values. I'd like to inject only if ttlSecondsAfterFinished does not exist OR is greater than 3600 (1 hour), and then assign a value of 3600; ideally this could be done in one mutation policy instead of two. That would require more conditional operators to be supported.

I understand from #1544 some of this may be looked at again after beta.

Do you have a use case that is blocked by not having more sophisticated condition checking?

Yes. For instance, I wish to use Gatekeeper to enforce a rule like the following:

A Deployment's memory limit should be at most 5Gi. If a user creates a Deployment with a memory limit > 5Gi, it will be overwritten with 5Gi.

Could that not be covered by a LimitRange? It wouldn't overwrite the Deployment itself, but it would enforce the limit at the Pod level. If you wanted more sophisticated policies for different Deployments or controllers, though, a LimitRange would not be flexible enough.

https://kubernetes.io/docs/tasks/administer-cluster/manage-resources/memory-constraint-namespace/
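For reference, the LimitRange alternative from the linked doc would look something like the sketch below. The namespace name is a placeholder; a LimitRange is namespaced, so this would need to be created in every namespace where the cap should apply:

```yaml
# Hedged sketch: cap container memory at the Pod level in one
# namespace, without mutating the Deployment object itself.
apiVersion: v1
kind: LimitRange
metadata:
  name: mem-limit-range
  namespace: example-ns   # placeholder namespace
spec:
  limits:
    - type: Container
      max:
        memory: 5Gi       # Pods requesting more are rejected at admission
      default:
        memory: 5Gi       # applied when a container sets no limit
```

Unlike the mutation approach, a Pod exceeding the max is rejected rather than rewritten, which avoids the controller-fighting problem discussed earlier in this thread.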

ctrought avatar Aug 17 '22 19:08 ctrought


I have another situation where I would like to make a mutation conditional on the value of another field.

If spec.runtimeClassName is not "gvisor" and spec.securityContext.seccompProfile.type is unset, then and only then set spec.securityContext.seccompProfile.type: "RuntimeDefault".

You may imagine that this rule is important to me: the user either uses gVisor or a seccomp profile. (This mutation would set a default value whenever the user decides not to use gVisor.) I can't see why exactly this rule would be a problem for the Kubernetes API in any way, shape or form, if the mutation hook implemented something like this.
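Half of this rule is expressible today. A hedged sketch (the mutation name is made up): a pathTest can guard the "seccompProfile.type is unset" part, but there is no way to add the "runtimeClassName is not gvisor" condition, which is precisely the sophisticated conditional this issue requests:

```yaml
# Partial sketch: defaults the seccomp profile only when unset.
# The missing piece is a condition on spec.runtimeClassName,
# which current mutation conditionals cannot express.
apiVersion: mutations.gatekeeper.sh/v1alpha1
kind: Assign
metadata:
  name: default-seccomp-profile   # hypothetical name
spec:
  applyTo:
    - groups: [""]
      versions: ["v1"]
      kinds: ["Pod"]
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
  location: "spec.securityContext.seccompProfile.type"
  parameters:
    assign:
      value: "RuntimeDefault"
    pathTests:
      - subPath: "spec.securityContext.seccompProfile.type"
        condition: MustNotExist   # only default when the user set nothing
```

As written, this would also overwrite the implicit "no seccomp profile" choice of gVisor Pods, which is why the field-value conditional matters for this use case.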

Well, I can understand the reasoning behind why the mutation hooks are so limited.

Has anyone found a workaround for such needs?

fuog avatar Dec 15 '22 07:12 fuog