Policy Attachment Should Support a TargetRef with Gateway Class Name(s)
What would you like to be added: A way for policies to apply only to specific GatewayClasses. This would not be a required field, but one that policies could choose to include in their targetRef, and tools like gwctl would recognize and support it when present.
For example, we might have a targetRef like this:
```yaml
targetRefs:
- kind: Service
  name: foo
  gatewayClassName: bar
```
This would mean that this specific policy attached to Service foo would only be implemented by Gateways of the bar class, and would be ignored by other Gateways.
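To make the idea concrete, here is a rough sketch of how this could look on a full policy manifest. The gatewayClassName field inside the target reference is the hypothetical addition proposed here; the rest follows the BackendTLSPolicy shape (v1alpha3 at the time of writing), so treat it as an illustration rather than an agreed design:

```yaml
apiVersion: gateway.networking.k8s.io/v1alpha3
kind: BackendTLSPolicy
metadata:
  name: foo-tls
  namespace: default
spec:
  targetRefs:
  - group: ""                # core API group for Service
    kind: Service
    name: foo
    gatewayClassName: bar    # hypothetical field: only Gateways of class "bar" honor this policy
  validation:
    hostname: foo.example.com
    wellKnownCACertificates: System
```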
Why this is needed: We're seeing the rise of multi-implementation policies used with Gateway API - policies intended to be implemented by multiple GatewayClasses, such as BackendTLSPolicy, BackendLBPolicy, and even some vendor-specific policies where the vendor actually has multiple implementations.
Can you please change "ClassName(s)" to "gateway class name(s)"?
+1
Adding this will make the attachment 1:1, and will allow implementations to safely set a ReasonTargetNotFound reason in the status when a policy cannot attach to its target.
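For reference, that status might look roughly like the following - a sketch based on the Gateway API PolicyAncestorStatus pattern; the Gateway and controller names are illustrative:

```yaml
status:
  ancestors:
  - ancestorRef:
      group: gateway.networking.k8s.io
      kind: Gateway
      name: example-gateway
    controllerName: example.net/gateway-controller   # illustrative controller name
    conditions:
    - type: Accepted
      status: "False"
      reason: TargetNotFound                          # defined as PolicyReasonTargetNotFound in the Gateway API
      message: targeted Service "foo" not found
```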
relates to https://github.com/envoyproxy/gateway/issues/2724
Rethinking this one: the issue is tied to the implementation's inability to decide whether it's meant to reconcile a policy or not. It doesn't have this problem for other Gateway API resources, since they all link back to a GatewayClass parent, which links to a controller/implementation. So would it be better to add a way to link back to the controller in the policy spec itself?
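Something along these lines, where controllerName is a purely hypothetical field naming the implementation that should reconcile the policy (mirroring GatewayClass.spec.controllerName):

```yaml
spec:
  controllerName: example.net/gateway-controller   # hypothetical: the implementation meant to reconcile this policy
  targetRefs:
  - kind: Service
    name: foo
```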
I'm torn because on one hand, this feels more like a policy all on its own (almost ReferenceGrantish?). The problem with that option is that it would require yet another optional resource that needs to be created. 🤔 Maybe it's the least bad option?
With Gateway API policies like BackendTLSPolicy and BackendLBPolicy, another use case would be for cluster admins to be empowered to create "global" policies that would be applied to specific GatewayClasses and their children.
On the other hand, other policies, like Kuadrant's AuthPolicy and RateLimitPolicy, can already target specific HTTPRoutes, or a set of listeners in a gateway. Why stop at GatewayClasses as targets? Check out Kuadrant's policy machinery. It uses a "Topology struct for modeling topologies of targetable network resources and corresponding attached policies". 😎
cc @maleck13 @guicassolato
I'm realizing my issue description did not actually clearly convey what I meant. I actually meant a way for targetRefs to include an optional gatewayClassName field, for example:
```yaml
targetRefs:
- kind: Service
  name: foo
  gatewayClassName: bar
```
This would mean that this specific policy attached to Service foo would only be implemented by Gateways of the bar class, and would be ignored by other Gateways. I will rework my original description to make that intent clearer.
I think I kinda agree with @arkodg's line of thinking here but feel like we may be conflating two somewhat-related concerns:
- Should the controller for my implementation pay attention to this policy?
- Currently, if a controller understands/is watching for a policy, it is generally assumed it should be respected. This seems mostly fine for implementation-specific policies, but centrally-defined policies like BackendTLSPolicy used by multiple implementations can make this more complicated.
- For which "clients" should a policy be applied when it targets a resource that lacks a clear ancestry line up to a given controller (such as Service, which can't be traced back the way HTTPRoute -> parentRef Gateway -> gatewayClassName GatewayClass -> controllerName can)?
- I believe this is related to the concerns in #2755 @christianang @sunjayBhatia, and it appears @kate-osborn is working on similar functionality over in NGINX Gateway Fabric with nginxinc/nginx-gateway-fabric#1940
1️⃣ (and by extension scoping an existing targetRef to gatewayClassName) is a somewhat coarse way to achieve 2️⃣, by implying that all Gateway clients of a given controller (or GatewayClass) should respect a policy. While this could be useful, I'm not sure it's actually sufficiently granular: it may be inadequate for use cases where different Gateway clients of the same GatewayClass or controller should have different configuration, policy, or priority when addressing a given backend, and it's definitely inadequate for other cases of multiple-selection policy, such as AuthZ policy, where a service owner would want to configure specific authorization policies for different clients.
I'm more inclined to head in the direction @candita is suggesting. If we could design a secondary targeting structure more like Kuadrant's topology, or what would be needed for the AuthZ policy case (like the multiple from clauses in Istio's AuthorizationPolicy, which scope which clients are allowed to connect to backends associated with a given namespace/pod selector/targetRef), we would also be able to solve Gateway client policy selection in a more granular way.
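For context, the Istio pattern alluded to looks roughly like this (a sketch; the namespace, labels, and principals are illustrative): an AuthorizationPolicy scoped to a set of backend workloads, with multiple from clauses describing which clients may connect.

```yaml
apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: allow-known-clients
  namespace: backend            # illustrative namespace
spec:
  selector:
    matchLabels:
      app: foo                  # scope the policy to a set of backend workloads
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/frontend/sa/web"]   # a specific client workload identity
    - source:
        namespaces: ["ops"]                                 # any client from this namespace
```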
Solution 1

```yaml
targetRefs:
- kind: HTTPRoute
  name: route1
  gatewayClassName: class1
```

Solution 2

```yaml
gatewayClassName: class1
targetRefs:
- kind: HTTPRoute
  name: route1
```
Although both options solve the problems outlined in this issue, IMO Solution 2 creates a stronger parent-child link between a GatewayClass and a policy.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
/remove-lifecycle rotten
@guicassolato, was this issue discussed in person at KubeCon?
Bumping this in the context of https://github.com/istio/istio/issues/53991 - would the proposal in this issue conflict with (or simply be confusing alongside) policy attachment patterns that target a GatewayClass directly? GEP-2649 seems to allow that, with the caveats that it's "tricky", that it might require a cluster-scoped policy resource (and thus likely only be usable by the cluster operator Chihiro persona), and that a GatewayClass paramsRef resource field may be more straightforward for this behavior.
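For reference, the paramsRef approach mentioned above would look roughly like the following sketch, using the existing GatewayClass.spec.parametersRef field; the config kind and its contents are implementation-specific and purely illustrative:

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: istio
spec:
  controllerName: istio.io/gateway-controller
  parametersRef:                  # existing Gateway API field
    group: example.net            # illustrative implementation-specific config resource
    kind: GatewayClassConfig
    name: default-deny-authz
    namespace: istio-system
```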
Would our use case in Istio (a cluster-wide default-deny AuthZ policy) be more appropriate for introducing and targeting the Mesh resource we've discussed for GAMMA, which has sat in limbo without a clear motivating need?
/cc @ilrudie
We did discuss this in person, and I recorded my concerns about combinatorial complexity.
Personally, I feel like the pattern of making things broadly target some class of other object then having a field that says "oh, only the ones that roll up to this GatewayClass" smells funny.
I think for use cases like @mikemorris talks about above, it's way better to anchor the Policy to something that fits into an existing hierarchy.
Every dimension we add to Policy targeting produces an exponential effect on the overall complexity.
Already, it's possible to have a single Policy that targets multiple objects in multiple namespaces, and folks are also asking for label selectors - if we add this on top, then you can have a single policy that targets a label selector, but only for objects that roll up to a single GatewayClass. That's rapidly approaching the level of spooky-action-at-a-distance that is completely impenetrable to Ana the end user.
The Kubernetes project currently lacks enough contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle stale
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.
This bot triages un-triaged issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue as fresh with /remove-lifecycle rotten
- Close this issue with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.
This bot triages issues according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Reopen this issue with /reopen
- Mark this issue as fresh with /remove-lifecycle rotten
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/close not-planned
@k8s-triage-robot: Closing this issue, marking it as "Not Planned".