
Mutate restrict policy to specific verb (create, update)

ctml91 opened this issue 2 years ago • 32 comments

Describe the solution you'd like:
The mutation CRDs should have an option to restrict the policy to certain verbs: either create, update, or both.

Anything else you would like to add:
As an example, when a mutation policy is active for a pod, assigning certain fields in its spec, some Kubernetes operators still need to be able to label existing pods. While the GK policy is active it enforces on both creates and updates to the pod, which prevents the operator managing the pods from labelling them: the webhook attempts to mutate the pod's spec during the update, which is not allowed. Only certain fields of a pod can be changed during an update (labels among them), and when the mutation is blocked the update throws errors like the one below.

spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds`, `spec.tolerations` (only additions to existing tolerations) or `spec.terminationGracePeriodSeconds`

Environment:

  • Gatekeeper version: 3.9
  • Kubernetes version (use kubectl version): 1.23.5

ctml91 avatar Jul 28 '22 13:07 ctml91

I tried to work around the issue by using a MustNotExist condition on nodeName in the pod spec, so the mutation would only apply when that field does not exist (i.e. the pod is not yet scheduled and is therefore being created), but apparently the conditions can only test a prefix of the targeted mutation field.
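
For illustration, a sketch of that rejected configuration, borrowing the mutator posted later in this thread (pathTests is the Assign API's condition mechanism; the rejection happens because spec.nodeName is not a prefix of location):

apiVersion: mutations.gatekeeper.sh/v1alpha1
kind: Assign
metadata:
  name: pod-fsgroupchangepolicy
spec:
  applyTo:
  - groups: [""]
    kinds: ["Pod"]
    versions: ["v1"]
  location: "spec.securityContext.fsGroupChangePolicy"
  parameters:
    pathTests:
    # Rejected: subPath must be a prefix of (or equal to) location, so an
    # unrelated field like spec.nodeName can't serve as an "is this a create?" test.
    - subPath: "spec.nodeName"
      condition: MustNotExist
    assign:
      value: "OnRootMismatch"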

ctml91 avatar Jul 28 '22 14:07 ctml91

So updating a pod can lead to mutation attempting to set fields that K8s doesn't allow to be updated?

Interesting. I'm guessing this is for newly defined mutators? Given the pod would have already had the fields set if it had been created beneath the mutator?

Match criteria are a first instinct for something like this, but that puts us in new terrain, since such a selector would be edge-based (i.e. the request method is only available at K8s admission time, not at shift-left or as part of the soon-to-be-committed generator resource validation code: #2062). In a shift-left/audit use case, it's not always possible to infer what the correct operation would be: whether the resource already exists shift-right would control whether it is being updated, and there the 3-way merge would be important behavior for validation, since it would provide previously-defaulted fields.

Also, if we used request method for a selector, other metadata like requesting user and request time would be natural next steps, which could cause us to over-fit on admission-time at the expense of shift-left and audit.

With validation, this is less of an issue since Rego can fill in the gaps.

I think the right thing to do here is to leave match alone and add some kind of metadata selector that's internal to mutations. To handle non-admission use cases, it should have either well-defined or user-customizable behavior for when the relevant metadata is missing.

maxsmythe avatar Jul 30 '22 02:07 maxsmythe

Interesting. I'm guessing this is for newly defined mutators? Given the pod would have already had the fields set if it had been created beneath the mutator?

Yes, that's correct: it affects existing pods that are still subject to newly defined mutators during updates. I suppose it could be true for any K8s object where a subset of fields becomes immutable after creation (e.g. StatefulSets have some).

ctml91 avatar Jul 30 '22 02:07 ctml91

Ack, thanks for confirming!

@ritazh @sozercan any objections to the proposed solution?

maxsmythe avatar Jul 30 '22 03:07 maxsmythe

I agree the verb should not be part of the match criteria but can be used as part of the mutation logic. This is similar to validation today, where you can check the verb explicitly as part of the Rego. Are you thinking it would be a new field in the mutation CRD?
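
For reference, a minimal sketch of what that looks like on the validation side today (the template name and message are illustrative; the admission verb is exposed to Rego as input.review.operation):

apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequirefsgrouppolicy
spec:
  crd:
    spec:
      names:
        kind: K8sRequireFsGroupPolicy
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequirefsgrouppolicy

        violation[{"msg": msg}] {
          # Branch on the admission verb: only enforce on CREATE.
          input.review.operation == "CREATE"
          not input.review.object.spec.securityContext.fsGroupChangePolicy
          msg := "fsGroupChangePolicy must be set at creation time"
        }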

ritazh avatar Jul 30 '22 14:07 ritazh

Yeah, a new field in the mutation CRD sounds right.

@ctml91 one concern I have is that only applying the mutation on "create" would cause the same problem in reverse... do we know if the mutation NOT applying on update will cause the field to be missing, which would cause K8s to infer deletion?

maxsmythe avatar Aug 01 '22 21:08 maxsmythe

Yeah, a new field in the mutation CRD sounds right.

@ctml91 one concern I have is that only applying the mutation on "create" would cause the same problem in reverse... do we know if the mutation NOT applying on update will cause the field to be missing, which would cause K8s to infer deletion?

Yes, I guess it could, but then it's probably for the user to decide whether that should be possible, depending on whether the scope is limited to create only. Unless it were somehow possible to know which fields of the API being mutated are immutable for updates, so GK could ignore them... but I think that is probably not possible.

Also the merge strategy for a patch would dictate whether that field is removed or not during an update.
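
For example (a sketch using labels, since they are mutable on pods): under JSON merge patch semantics, a field omitted from the patch is left untouched, while a field explicitly set to null is deleted.

# JSON merge patch: omitted fields are left alone; only the label changes.
kubectl patch pod <pod> --type=merge -p '{"metadata":{"labels":{"test":"true"}}}'

# An explicit null deletes the key (here, removes the label).
kubectl patch pod <pod> --type=merge -p '{"metadata":{"labels":{"test":null}}}'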

ctml91 avatar Aug 01 '22 23:08 ctml91

Whether the delete becomes inferred depends on whether the field ownership is set via server-side-apply when mutations are applied:

https://kubernetes.io/docs/reference/using-api/server-side-apply/#introduction

Does the mutated field show up in metadata.managedFields for the relevant account?
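
One way to check, for reference (recent kubectl hides managed fields by default):

kubectl get pod <pod> -o yaml --show-managed-fields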

If so, then more design work would be needed, since being able to select only-on-create wouldn't solve your issue but would make it more prevalent. Putting it to the user to define the behavior is good, but we'd then need to make sure we've given the necessary knobs, and have done so in a way that minimizes complexity and the risk of causing future issues (operationally or with respect to API design).

If field owner is set and that triggers the deletion issue, we'd probably need to do something that extracts the value from the original field and re-adds it, with maybe some additional nuance around checking managedFields to avoid issues on resources with multiple owners.

maxsmythe avatar Aug 02 '22 00:08 maxsmythe

If field owner is set and that triggers the deletion issue, we'd probably need to do something that extracts the value from the original field and re-adds it, with maybe some additional nuance around checking managedFields to avoid issues on resources with multiple owners.

^^^ the above would also conflict with only-on-create, as this is behavior that would need to exist for update, so we may be looking at a different knob.

@ctml91 Would it be possible to paste a copy of the mutated resource and relevant mutator?

maxsmythe avatar Aug 02 '22 00:08 maxsmythe

If field owner is set and that triggers the deletion issue, we'd probably need to do something that extracts the value from the original field and re-adds it, with maybe some additional nuance around checking managedFields to avoid issues on resources with multiple owners.

^^^ the above would also conflict with only-on-create, as this is behavior that would need to exist for update, so we may be looking at a different knob.

@ctml91 Would it be possible to paste a copy of the mutated resource and relevant mutator?

apiVersion: mutations.gatekeeper.sh/v1alpha1
kind: Assign
metadata:
  name: pod-fsgroupchangepolicy
spec:
  location: "spec.securityContext.fsGroupChangePolicy"
  applyTo:
  - groups: [""]
    kinds: ["Pod"]
    versions: ["v1"]
  match:
    kinds:
    - apiGroups:
      - '*'
      kinds:
      - Pod
    namespaceSelector:
      matchExpressions:
      - key: policies.io/fsgroupchangepolicy
        operator: Exists
    scope: Namespaced
  parameters:
    assign:
      value: "OnRootMismatch

You can reproduce by applying the above mutator and then trying to label an existing pod (which works otherwise):

kubectl label pod <pod> test=true

We also encountered some issues with Kubernetes garbage collection, I guess because it needs to update the pod in some cases. This results in the pod appearing to be stuck in a terminating state indefinitely, even though Kubernetes successfully terminated the pod. It seems to be pretty rare; I don't know why it doesn't always happen. Something to do with the owner references when the pod is being terminated. Below are the logs we saw in kube-controller-manager. Deleting the policy allowed the pod to be garbage collected.

E0803 16:28:25.868637 1 garbagecollector.go:347] error syncing item &garbagecollector.node{identity:garbagecollector.objectReference{OwnerReference:v1.OwnerReference{APIVersion:"v1", Kind:"Pod", Name:"pod-name", UID:"cc9bc7f9-aaf9-416d-ab24-345d743a1549", Controller:(*bool)(nil), BlockOwnerDeletion:(*bool)(nil)}, Namespace:"namespace"}, dependentsLock:sync.RWMutex{w:sync.Mutex{state:0, sema:0x0}, writerSem:0x0, readerSem:0x0, readerCount:1, readerWait:0}, dependents:map[*garbagecollector.node]struct {}{}, deletingDependents:true, deletingDependentsLock:sync.RWMutex{w:sync.Mutex{state:0, sema:0x0}, writerSem:0x0, readerSem:0x0, readerCount:0, readerWait:0}, beingDeleted:true, beingDeletedLock:sync.RWMutex{w:sync.Mutex{state:0, sema:0x0}, writerSem:0x0, readerSem:0x0, readerCount:0, readerWait:0}, virtual:false, virtualLock:sync.RWMutex{w:sync.Mutex{state:0, sema:0x0}, writerSem:0x0, readerSem:0x0, readerCount:0, readerWait:0}, owners:[]v1.OwnerReference{v1.OwnerReference{APIVersion:"apps/v1", Kind:"StatefulSet", Name:"statefulset-name", UID:"36575c4c-0a79-4ddb-a4d3-216876bf8908", Controller:(*bool)(0xc024999dcf), BlockOwnerDeletion:(*bool)(0xc024999df0)}}}: Pod "pod-name" is invalid: spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds`, `spec.tolerations` (only additions to existing tolerations) or `spec.terminationGracePeriodSeconds` (allow it to be set to 1 if it was previously negative)

ctml91 avatar Aug 03 '22 20:08 ctml91

WRT deletion... interesting. I wonder if that has something to do with the use of foreground deletion.

It doesn't look like mutated fields show up in kubectl.kubernetes.io/last-applied-configuration (client-side apply), which means client-side apply has no issues with fields added outside the original intent. However, I used the following mutator and resource:

apiVersion: mutations.gatekeeper.sh/v1alpha1
kind: Assign
metadata:
  name: pod-fsgroupchangepolicy
spec:
  location: "spec.containers[name: opa].resources.limits.cpu"
  applyTo:
  - groups: [""]
    kinds: ["Pod"]
    versions: ["v1"]
  match:
    kinds:
    - apiGroups:
      - '*'
      kinds:
      - Pod
    namespaceSelector:
      matchExpressions:
      - key: policies.io/fsgroupchangepolicy
        operator: Exists
    scope: Namespaced
  parameters:
    assign:
      value: "1"
---
apiVersion: v1
kind: Pod
metadata:
  name: opa-post-mutation
  namespace: production
  labels:
    owner: me.agilebank.demo
    test2: "alsovalid"
    mutationremoved: "success"
spec:
  containers:
    - name: opa
      image: openpolicyagent/opa:0.9.2
      args:
        - "run"
        - "--server"
        - "--addr=localhost:8080"
      resources:
        limits:
          cpu: "100m"
          memory: "30Mi"

with the "mutationremoved" label being applied after the assign mutator was deleted (this emulates only-on-create), which yielded the following error:

$ kubectl apply -f opa2.yaml
The Pod "opa-post-mutation" is invalid: spec: Forbidden: pod updates may not change fields other than `spec.containers[*].image`, `spec.initContainers[*].image`, `spec.activeDeadlineSeconds` or `spec.tolerations` (only additions to existing tolerations)
  core.PodSpec{
        Volumes:        {{Name: "kube-api-access-4tnbr", VolumeSource: {Projected: &{Sources: {{ServiceAccountToken: &{ExpirationSeconds: 3607, Path: "token"}}, {ConfigMap: &{LocalObjectReference: {Name: "kube-root-ca.crt"}, Items: {{Key: "ca.crt", Path: "ca.crt"}}}}, {DownwardAPI: &{Items: {{Path: "namespace", FieldRef: &{APIVersion: "v1", FieldPath: "metadata.namespace"}}}}}}, DefaultMode: &420}}}},
        InitContainers: nil,
        Containers: []core.Container{
                {
                        ... // 6 identical fields
                        EnvFrom: nil,
                        Env:     nil,
                        Resources: core.ResourceRequirements{
                                Limits: core.ResourceList{
-                                       s"cpu":    {i: resource.int64Amount{value: 100, scale: -3}, s: "100m", Format: "DecimalSI"},
+                                       s"cpu":    {i: resource.int64Amount{value: 1}, s: "1", Format: "DecimalSI"},
                                        s"memory": {i: {...}, s: "30Mi", Format: "BinarySI"},
                                },
                                Requests: {s"cpu": {i: {...}, s: "100m", Format: "DecimalSI"}, s"memory": {i: {...}, s: "30Mi", Format: "BinarySI"}},
                        },
                        VolumeMounts:  {{Name: "kube-api-access-4tnbr", ReadOnly: true, MountPath: "/var/run/secrets/kubernetes.io/serviceaccount"}},
                        VolumeDevices: nil,
                        ... // 11 identical fields
                },
        },
        EphemeralContainers: nil,
        RestartPolicy:       "Always",
        ... // 25 identical fields
  }

This suggests that only-on-create will need to be augmented a bit to avoid these kinds of issues. I don't know if we can handle problems that arise after a mutator has been deleted, but hopefully we can make newly added/modified mutators friendlier to pre-existing pods. I want to dig into a server-side-apply cluster's behavior a bit before trying to figure out what the best fit is here.

maxsmythe avatar Aug 04 '22 01:08 maxsmythe

What do you think about something that says: "on CREATE, use the provided value, on UPDATE use the value from oldObject"?

maxsmythe avatar Aug 16 '22 01:08 maxsmythe

What do you think about something that says: "on CREATE, use the provided value, on UPDATE use the value from oldObject"?

Sounds reasonable, as long as it's not an issue that the target field & value may not exist in oldObject. Rather than configuring the on-update value, could it just be done automatically and toggled via a boolean that disables the mutation for an update? For example, Assign.spec.parameters.assign.onUpdate = false would enable the behaviour you described.
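
Purely hypothetical, to illustrate the shape of the proposal (no such onUpdate field exists in the Assign API today):

apiVersion: mutations.gatekeeper.sh/v1alpha1
kind: Assign
metadata:
  name: pod-fsgroupchangepolicy
spec:
  location: "spec.securityContext.fsGroupChangePolicy"
  parameters:
    assign:
      value: "OnRootMismatch"
      # Hypothetical knob: on UPDATE, skip the mutation (or carry the
      # value over from oldObject) instead of assigning `value`.
      onUpdate: false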

ctml91 avatar Aug 16 '22 02:08 ctml91

Sounds reasonable, as long as it's not an issue that the target field & value may not exist in oldObject.

In the case of a missing field in oldObject, I'd expect the mutator to remove the field from the current object, if it exists.

Rather than configuring the on-update value, could it just be done automatically and toggled via a boolean that disables the mutation for an update? For example, Assign.spec.parameters.assign.onUpdate = false would enable the behaviour you described.

I haven't thought about the API design yet, but something similar to that. The specifics of how it'd be expressed (boolean vs. enum, whether the field should live next to value, etc.) will probably come from concerns like how self-documenting it is, room left to add new functionality in the future as needed, and cohesion with current design choices.

maxsmythe avatar Aug 17 '22 00:08 maxsmythe

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Oct 16 '22 18:10 stale[bot]

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Dec 17 '22 00:12 stale[bot]

/reopen

ctml91 avatar Dec 31 '22 02:12 ctml91

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Mar 02 '23 07:03 stale[bot]

active

ctml91 avatar Mar 02 '23 16:03 ctml91

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

stale[bot] avatar May 02 '23 10:05 stale[bot]

active

ctml91 avatar May 02 '23 12:05 ctml91

We're having similar issues. We would love to see a way for a mutation to work on CREATE only.

Sartigan avatar May 15 '23 16:05 Sartigan

Affecting us too. Personally I would consider this a high-priority bug, as the overwhelming majority of users who deploy GK will do so in an existing cluster and apply mutators against namespaces with resources that already exist. A mutator targeting fields that are immutable for updates is pretty common (StatefulSets, pods, probably others?), and users may be unaware that GK is the reason some objects are not being garbage collected by KCM, since the errors are buried in the KCM logs.

I think we either need a toggle that allows the user full control over the scope of the mutation verb (create, update, delete?), or we could use the mutation-tracking annotations to know whether a pod has had mutations applied, though that might not provide enough detail or stay current with the latest version of a mutator. I'd prefer giving the user more control, so I'd opt for option 1, or some variant that mimics it.

Even better would be a CustomMutator API that allows users to write a mutator in a language like Rego or JavaScript, which would alleviate many other limitations of the "strict" mutators (yes, they are strict for good reason, but also limiting for many use cases). I know there is the CEL KEP for native admission control, which I think is implemented for 1.28, and another CEL KEP for mutation too, but they would not allow for referential mutation policies nor external data, which GK would still provide significant value for. Is a CustomMutator API something that could be considered? :D

ctrought avatar Aug 03 '23 19:08 ctrought

Affecting us too. Personally I would consider this a high-priority bug, as the overwhelming majority of users who deploy GK will do so in an existing cluster and apply mutators against namespaces with resources that already exist.

Agree that this is worth doing prior to the CEL KEP, since this affects functionality that is already in-the-wild.

I think we either need a toggle that allows the user full control over the scope of the mutation verb (create, update, delete?)

Looking further up the thread, it looks like limiting mutation to "only on create" could still cause problems if the mutator is overriding user-supplied intent:

https://github.com/open-policy-agent/gatekeeper/issues/2196#issuecomment-1204641841

A possible solution is proposed here:

https://github.com/open-policy-agent/gatekeeper/issues/2196#issuecomment-1216045434

they would not allow for referential mutation policies nor external data

We need to be very careful with how those are provided. Mutation webhooks are called in sequence, which makes them very latency sensitive. One reason mutation can provide external data is that we have strict enough syntax that we can parallelize those calls in a way that is logically consistent. Running arbitrary code would put us on an even performance footing with bespoke mutation webhooks (though there would be less infrastructure to run).

Is a CustomMutator API something that could be considered? :D

I'd definitely like to see what K8s comes up with for CEL-based mutation first, otherwise we can't really know how what they would implement would interact with what we would implement. It would also make it easier to know where, specifically, we can provide value.

@acpana are you interested in adding a field/functionality that works as described in https://github.com/open-policy-agent/gatekeeper/issues/2196#issuecomment-1216045434 ? Probably necessary for Assign, AssignImage and ModifySet. I'm not aware of metadata being similarly constrained anywhere. We should make it clear to users that using this flag will basically make the fields read-only (which is fine, since read-only fields are the reason for this request).

maxsmythe avatar Aug 05 '23 01:08 maxsmythe

Thanks for the tag Max! I'll assign it to myself when I have cycles.

acpana avatar Aug 10 '23 17:08 acpana

We were impacted by this mutation issue as well: a set of pods was stuck in a terminating state until we manually set the policies.io/disable-mutate label to true.

sandeepgsgit avatar Sep 07 '23 19:09 sandeepgsgit

Also affected. This also breaks other mutating webhooks that run after Gatekeeper, even if those webhooks only mutate allowed fields.

jan-kantert avatar Oct 24 '23 10:10 jan-kantert

We found a workaround for this. You can set the following in your Helm chart values to limit the webhook to CREATE. This will obviously break mutating webhooks on UPDATE. However, if you do not need those, this will work fine:

    mutatingWebhookCustomRules:
      - apiGroups:
        - '*'
        apiVersions:
        - '*'
        operations:
        - CREATE
        resources:
        - '*'
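
For reference, assuming the standard chart repo alias and release/namespace names (adjust to your install), the values above would be applied with something like:

$ helm upgrade gatekeeper gatekeeper/gatekeeper --namespace gatekeeper-system -f values.yaml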

jan-kantert avatar Oct 30 '23 13:10 jan-kantert

Bumping this -- would like to see this implemented. The failure mode this creates doesn't happen often, but when it does it's a PITA to get fixed.

skaven81 avatar Mar 12 '24 15:03 skaven81

BTW, the workaround noted above (adjusting the mutating webhook operations list) is fine if all of your mutators should only be active on create operations. But what if we have a mix of mutators, where some should apply on create+update and others on create only?

skaven81 avatar Mar 12 '24 15:03 skaven81