gatekeeper
gatekeeper copied to clipboard
Add an operation selector to constraints
I have these changes more or less ready to go but I was unsure about the best way to integrate with auditing, so I would appreciate any feedback on that.
Describe the solution you'd like Add an operation selector to constraints, like:
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
name: namespace-delete-allowed
spec:
match:
operations:
- DELETE
kinds:
- apiGroups: [""]
kinds: ["Namespace"]
parameters:
message: "Namespaces must be labelled with `delete-allowed` to be deleted"
labels:
- key: "delete-allowed"
allowedRegex: "^true$"
This would address a couple of problems I've encountered:
- If I want to match a specific operation I have to do it in the template rego, which is frustrating if I am performing matching which could be covered by a generally applicable template (k8srequirelabel, for instance)
- If I'm forwarding DELETE operations to the webhook then I have to modify a violating resource to make it conform to the policy before I can remove it. This can be confusing for end users who might not understand why setting a certain field is required to remove an object. It might also be impossible if the fields you're checking are immutable. Again, I have to explicitly edit the rego to work around this.
Anything else you would like to add: Where this gets a little tricky is the audit, which doesn't set an operation and where I don't think you could reliably infer from the list of operation selectors whether it makes sense for the constraint to produce a violation or not.
For instance, in the example you can see we have a constraint that ensures Namespaces have a particular 'delete-allowed' label on deletion to protect against users carelessly removing the Namespace. The fact that the label exists on the Namespace before it is deleted is not necessarily a violation, it's expected.
I think you could run into the same sort of problems if you're checking the user who submitted a request, or any other data that isn't applicable at audit time.
My initial thought was that another field could be added, something like exemptFromAudit
, which could be set to true
to not match during the audit.
My initial thought was that another field could be added, something like exemptFromAudit, which could be set to true to not match during the audit.
Perhaps a new enforcementAction
, something like denyonly
?
Including the operation in the match criteria would be a significant departure from the current usage of the match
field, as currently we are only matching against properties of the object itself, not properties of the request. Part of the reason for limiting this scope is to avoid the quandary that you're describing: "what do we do when there is insufficient data for one of the enforcement points?".
I like the idea of extending enforcementAction
a bit more. My thoughts here are that it may be worth being able to configure a constraint on a per-enforcement-point basis. Something like:
enforcementPoints:
- type: "admission.k8s.io"
enforcementAction: "deny"
requestOperations: ["DELETE"]
- type: "audit.gatekeeper.sh"
enforcementAction: "deny"
The idea here would be that depending on the enforcement point, different parameters could be provided that reflect the different capabilities of different enforcement points. For example, if audit supported paging admins directly but the validation webhook did not, then audit could be configured with which admins should be paged.
The above is a very rough draft, further requirements gathering would be needed to get something more future-proof.
It seems something like the above would also work for your use-case. I'm curious though: what benefit are you getting by requiring a label to be set before deletion? If a user has permission to CRUD an object, they can add the label manually and immediately delete.
I'm curious though: what benefit are you getting by requiring a label to be set before deletion?
This is mostly to protect ourselves from ourselves. We had an experience where a cluster admin deleted the wrong namespace, which obviously cascades down to all the objects in the namespace. By requiring a label be set before deletion we get some assurance that whoever is issuing the delete has thought about it.
As a bit of a related aside, what we've ended up doing to insulate other resource types from being needlessly evaluated for DELETE
operations is to define a separate webhook just for DELETE
:
- name: delete.gatekeeper.sh
admissionReviewVersions:
- v1
- v1beta1
clientConfig:
service:
name: gatekeeper-webhook-service
namespace: sys-gatekeeper
path: /v1/admit
failurePolicy: Ignore
matchPolicy: Exact
namespaceSelector:
matchExpressions:
- key: admission.gatekeeper.sh/ignore
operator: DoesNotExist
rules:
- apiGroups:
- ""
apiVersions:
- "v1"
operations:
- DELETE
resources:
- "namespaces"
sideEffects: None
timeoutSeconds: 3
This works for us because namespaces are the only objects we want to review on DELETE
.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
still relevant discussion