Change the NetworkPolicy Gatekeeper constraint for delete actions

Open linus-astrom opened this issue 1 year ago • 1 comments

Description

Change this Gatekeeper constraint so that it no longer activates on delete actions. Otherwise, Deployments and other Resources can become stuck in pending deletion if the associated NetworkPolicy is deleted before the Deployment, or, depending on the finalizer, if both Resources are deleted at the same time.

Additional context

No response

Definition of done

  • [ ] The Gatekeeper constraint is changed.
  • [ ] Test that Resources can be deleted without becoming stuck by this Gatekeeper constraint with opa.networkPolicies.enforcement set to deny.

linus-astrom avatar Apr 05 '24 12:04 linus-astrom

Only that particular constraint? Found a chart setting but I believe it applies widely.

https://github.com/elastisys/compliantkubernetes-apps/blob/7de50bb898d0bf647b9bad2e0bc02c77eb3785c3/helmfile.d/upstream/open-policy-agent-gatekeeper/gatekeeper/README.md?plain=1#L131

Zash avatar Apr 23 '24 14:04 Zash

Yes, I would like it to be only the specified constraint, as it's the one causing problems.

linus-astrom avatar Apr 24 '24 06:04 linus-astrom

Unable to reproduce this.

Per the validating webhook configuration, this policy should not be involved in deletions at all.

https://github.com/elastisys/compliantkubernetes-apps/blob/f6c941b56f87a73594dac8780d5bc14bb67ec16e/helmfile.d/values/gatekeeper/gatekeeper.yaml.gotmpl#L23-L25

Zash avatar May 21 '24 12:05 Zash

We have found that, under specific circumstances, an update operation reviewed by Gatekeeper can act as part of a delete operation.

For example, with the NetworkPolicy constraint set to deny: if you have a Deployment and a NetworkPolicy for that Deployment and you delete the NetworkPolicy, you can normally still delete the Deployment afterwards.

But if the only difference is that the Deployment uses a finalizer, then after you delete the NetworkPolicy you will not be able to fully delete the Deployment, and it will be stuck in pending deletion. The Kubernetes API actions for finalizers either set the key .metadata.deletionTimestamp or, if the delete is fast enough, try to empty the .metadata.finalizers field. Both actions are denied, because an update is only allowed while the Deployment has a NetworkPolicy, and in this case it was deleted.

This makes the update operation act as a delete operation for this Gatekeeper constraint.
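
To make the sequence above easier to follow, here is a minimal Python sketch of it. It is a toy model and does not use Gatekeeper or the Kubernetes API: the `Cluster` class, the finalizer name `example.com/cleanup`, and the `exempt_terminating` switch are all hypothetical names made up for illustration. It assumes the webhook reviews CREATE and UPDATE but not DELETE, and that removing a finalizer reaches the webhook as an UPDATE.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Workload:
    name: str
    finalizers: List[str] = field(default_factory=list)
    deletion_timestamp: Optional[str] = None


class Cluster:
    """Toy API server with one validating webhook for CREATE and UPDATE."""

    def __init__(self, exempt_terminating: bool = False):
        self.workloads = {}
        self.network_policies = set()  # names of workloads covered by a policy
        # Hypothetical change: skip the check for objects already being deleted.
        self.exempt_terminating = exempt_terminating

    def _admit(self, operation: str, workload: Workload) -> bool:
        if operation == "DELETE":
            return True  # assumption: DELETE is not reviewed by the webhook
        if self.exempt_terminating and workload.deletion_timestamp is not None:
            return True
        # The constraint in "deny" mode: a matching NetworkPolicy must exist.
        return workload.name in self.network_policies

    def create(self, workload: Workload) -> None:
        if not self._admit("CREATE", workload):
            raise PermissionError("denied: no matching NetworkPolicy")
        self.workloads[workload.name] = workload

    def delete(self, name: str) -> None:
        workload = self.workloads[name]
        if workload.finalizers:
            # With finalizers the object is only marked for deletion.
            workload.deletion_timestamp = "2024-05-24T12:00:00Z"
        else:
            del self.workloads[name]

    def remove_finalizers(self, name: str) -> bool:
        workload = self.workloads[name]
        # Clearing finalizers is an UPDATE, so the webhook reviews it.
        if not self._admit("UPDATE", workload):
            return False
        workload.finalizers.clear()
        if workload.deletion_timestamp is not None:
            del self.workloads[name]
        return True


def scenario(finalizers, exempt_terminating: bool = False) -> bool:
    """Return True if the workload is fully deleted at the end."""
    cluster = Cluster(exempt_terminating)
    cluster.network_policies.add("web")
    cluster.create(Workload("web", list(finalizers)))
    cluster.network_policies.discard("web")  # NetworkPolicy deleted first
    cluster.delete("web")
    if "web" in cluster.workloads:
        cluster.remove_finalizers("web")
    return "web" not in cluster.workloads
```

In this model, `scenario([])` removes the workload, `scenario(["example.com/cleanup"])` leaves it stuck with a deletion timestamp, and passing `exempt_terminating=True` lets the finalizer be cleared. Whether the real constraint can be exempted in this way is not confirmed anywhere in this thread.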

linus-astrom avatar May 24 '24 12:05 linus-astrom