gateway-api
discussion: find a simple way to figure out which policies are affecting a resource
What would you like to be added:
A simple way to figure out which policies are affecting a resource.
Why this is needed:
As a gateway maintainer, I am often asked about the configuration of a route. In the old days, I could figure it out simply by checking the spec of the APISIXRoute or VirtualService and their ancestors. However, in Gateway API, many features are configured via policy attachment rather than on the Route resource itself.
There are other ways to figure this out, though. If we create the policies through a RESTful API, we can keep a reverse index of targetRef in the database. If we create the policies via IaC, we can keep the policies in the same YAML as the route. To keep things simple I won't mention inherited policies, but they can be found by doing the same thing with the route's ancestors.
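For illustration, here is a minimal sketch of such a reverse index in Go (the Policy and TargetRef types are hypothetical stand-ins, not real Gateway API bindings):

```go
package main

import "fmt"

// TargetRef mirrors the group/kind/name a policy points at (hypothetical type).
type TargetRef struct {
	Group, Kind, Name string
}

// Policy is a stand-in for any policy-attachment resource.
type Policy struct {
	Name      string
	TargetRef TargetRef
}

// buildReverseIndex maps each target to the policies attached to it, so
// "which policies affect this route?" becomes a single map lookup.
func buildReverseIndex(policies []Policy) map[TargetRef][]Policy {
	idx := make(map[TargetRef][]Policy)
	for _, p := range policies {
		idx[p.TargetRef] = append(idx[p.TargetRef], p)
	}
	return idx
}

func main() {
	route := TargetRef{Group: "gateway.networking.k8s.io", Kind: "HTTPRoute", Name: "my-route"}
	idx := buildReverseIndex([]Policy{
		{Name: "retries", TargetRef: route},
		{Name: "timeouts", TargetRef: route},
	})
	for _, p := range idx[route] {
		fmt.Println(p.Name) // retries, timeouts
	}
}
```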
However, if we want a general way to solve this problem (ideally one that depends only on kubectl), AFAIK there is no way to do it.
I find https://gateway-api.sigs.k8s.io/geps/gep-713/#standard-status-condition-on-policy-affected-objects is close to this topic, but it only defines a boolean PolicyAffected flag. Since the condition type name has to be unique, it cannot represent the multiple policies that may affect a single resource.
One preliminary solution that comes to mind is adding an annotation to the resource that stores the group-kind-name of every attached policy.
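To make that concrete, here is a rough sketch of how an implementation might write such an annotation; the annotation key and the group/kind/name format below are just assumptions for illustration, nothing like this is defined by Gateway API:

```go
package main

import (
	"fmt"
	"strings"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// annotationKey is a made-up key for illustration; Gateway API defines no such annotation.
const annotationKey = "policies.example.gateway.networking.k8s.io/attached"

// policyRef records one attached policy as group, kind, and name.
type policyRef struct {
	Group, Kind, Name string
}

// setAttachedPoliciesAnnotation writes every attached policy onto the target's
// metadata, so the list can be read back with plain kubectl get -o yaml.
func setAttachedPoliciesAnnotation(meta *metav1.ObjectMeta, refs []policyRef) {
	entries := make([]string, 0, len(refs))
	for _, r := range refs {
		entries = append(entries, fmt.Sprintf("%s/%s/%s", r.Group, r.Kind, r.Name))
	}
	if meta.Annotations == nil {
		meta.Annotations = map[string]string{}
	}
	meta.Annotations[annotationKey] = strings.Join(entries, ",")
}

func main() {
	var routeMeta metav1.ObjectMeta
	setAttachedPoliciesAnnotation(&routeMeta, []policyRef{
		{Group: "gateway.example.com", Kind: "RetryPolicy", Name: "retries"},
		{Group: "gateway.example.com", Kind: "TimeoutPolicy", Name: "timeouts"},
	})
	fmt.Println(routeMeta.Annotations[annotationKey])
}
```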
@spacewander, support for the PolicyAffected status condition described in GEP-713 is still provisional, and implementations are not yet bound to it. However, I do see increasing adoption of the pattern, e.g. by Kuadrant and NGINX Gateway Fabric.
I don't know if a single <GatewayController>/<PolicyKind>PolicyAffected condition per gateway controller name and policy kind would be enough for you or if you're looking for something more granular. With the currently proposed status condition, details such as the name of the policy resource would go in the status message.
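As a concrete example, a condition following that pattern might be written like this; the gateway.example.com domain, the RetryPolicy kind, and the reason string are made-up placeholders, not values mandated by GEP-713:

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func main() {
	// Status conditions already present on the affected object (e.g. an HTTPRoute).
	var conditions []metav1.Condition

	// One condition per controller domain and policy kind; the identity of the
	// policy resource goes into the message, as described above.
	meta.SetStatusCondition(&conditions, metav1.Condition{
		Type:               "gateway.example.com/RetryPolicyAffected", // hypothetical domain and kind
		Status:             metav1.ConditionTrue,
		Reason:             "PolicyAffected",
		Message:            "Object affected by RetryPolicy default/retries",
		ObservedGeneration: 3,
	})

	fmt.Printf("%+v\n", conditions)
}
```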
Another existing option is describing the resource using gwctl.
The problem with most approaches that put direct references to all of an object's attached policies onto that object is that you're creating a fan-out or fan-in update problem for your implementation or the API server.
That is, you're creating a situation where an update to one object (the Policy) creates multiple further updates to the status of multiple objects (every object the Policy affects).
This is fine when the number of affected objects is small, but it adds significant complexity to keeping the system up to date.
The annotation solution you describe has the same problem. We did look at something similar earlier, but it means that you could conceivably have one object update (on the Policy) generating many other updates (updating every object the Policy targets).
This is even worse now that a single Policy can have multiple targetRefs (for good reason), because it creates a vector for inadvertent or malicious increases in API server load. The worst case is a malicious user performing a denial-of-service attack on the API server by creating a single Policy that targets many objects and rapidly flipping it between valid and invalid. Each time the Policy becomes valid, the implementation must update every affected object with the complete set of annotations, and every time it becomes invalid, the same applies.
This is why we've been focussing on building tooling and libraries into gwctl for doing this lookup for you on the client side instead, where we can more safely assume user-scale timelines (delaying queries for a second or two is okay there because it's more likely to be a user or user-adjacent process).
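As a sketch of what such a client-side lookup can look like with nothing but list calls, here is a rough example using the dynamic client; the RetryPolicy GVR and the spec.targetRefs field path are assumptions about one particular policy CRD, and gwctl's actual implementation may differ:

```go
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/tools/clientcmd"
)

// policiesAffecting lists every policy of the given GVR in a namespace and
// keeps the ones whose spec.targetRefs include the named kind/name.
func policiesAffecting(ctx context.Context, client dynamic.Interface,
	gvr schema.GroupVersionResource, namespace, targetKind, targetName string) ([]string, error) {

	list, err := client.Resource(gvr).Namespace(namespace).List(ctx, metav1.ListOptions{})
	if err != nil {
		return nil, err
	}

	var matches []string
	for _, item := range list.Items {
		refs, found, err := unstructured.NestedSlice(item.Object, "spec", "targetRefs")
		if err != nil || !found {
			continue
		}
		for _, r := range refs {
			ref, ok := r.(map[string]interface{})
			if !ok {
				continue
			}
			if ref["kind"] == targetKind && ref["name"] == targetName {
				matches = append(matches, item.GetName())
			}
		}
	}
	return matches, nil
}

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client, err := dynamic.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// Hypothetical policy CRD; substitute the GVR of the policy kind you care about.
	gvr := schema.GroupVersionResource{Group: "gateway.example.com", Version: "v1alpha1", Resource: "retrypolicies"}
	names, err := policiesAffecting(context.Background(), client, gvr, "default", "HTTPRoute", "my-route")
	if err != nil {
		panic(err)
	}
	fmt.Println("policies affecting default/my-route:", names)
}
```

The cost here is an O(policies) list-and-filter at query time on the client, which is exactly the trade-off described above: the lookup work moves to user-scale tooling instead of generating server-side status updates on every affected object.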
Agree with this.
/triage needs-information
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten