
Clarify BackendTLSPolicy can be used with all kinds of Services

Open guicassolato opened this issue 3 months ago • 12 comments

What would you like to be added:

Clarify in the docs that the BackendTLSPolicy kind can be used to configure TLS for the connection to any Service, not only backend services that sit behind an xRoute.

Why this is needed:

The docs for BackendTLSPolicy make very clear the use case the policy kind is primarily intended for: configuring the TLS connection "from the Gateway to a backend [that is listed in the backendRefs of the xRoute rules]" (my addition).

However, at a glance, it may not be clear that the policy kind can also be used to configure TLS for any Gateway-to-Service connection, such as TLS for external authorization requests, whose docs already state:

If the backend service requires TLS, use BackendTLSPolicy to tell the implementation to supply the TLS details to be used to connect to that backend.

Clarifying this point will signal the full scope of BackendTLSPolicy to implementations, avoiding any confusion that the policy is limited to xRoute backendRefs.

guicassolato avatar Sep 08 '25 14:09 guicassolato

Based on my own view and the discussion around the recent changes, it should apply to all of them, like you said. Being explicit would be a good enhancement to the docs, IMO.

howardjohn avatar Sep 08 '25 14:09 howardjohn

@guicassolato How does traffic move through a Gateway-to-Service without using an xRoute? I realize the service mesh case moves traffic without using a Gateway, but you're talking about a different case here. What is the doc reference for:

If the backend service requires TLS, use BackendTLSPolicy to tell the implementation to supply the TLS details to be used to connect to that backend.

Overall, I think external auth deserves its own policy. There doesn't seem to be any use case for BackendTLSPolicy to be repurposed for something like external auth policy.

Is this what you're referencing? https://docs.solo.io/gloo-mesh-enterprise/latest/reference/api/ext_auth_policy/

candita avatar Sep 11 '25 02:09 candita

@candita I think the point is backends like https://github.com/kubernetes-sigs/gateway-api/blob/58c1466f361b664fafdc39ae54588d904e02a04b/apis/v1/httproute_types.go#L1620, which are a feature of HTTPRoute but not the actual backendRef of the route.

Probably the clarification needed here is that the BTLS applies to any service selected at any moment by the Gateway when starting a communication, and not only to backends selected for a specific *Route rule match.

The way the BTLS will select may change in the future, though: https://github.com/kubernetes-sigs/gateway-api/pull/3876

rikatz avatar Sep 11 '25 13:09 rikatz

Hi @candita.

@rikatz's right. Backends behind HTTP filters are one use case.

But overall I would like to clarify that – at least in my understanding – if you target a Service with a BackendTLSPolicy, the intent is clear: traffic hitting that service must be encrypted, period. That is, independently of the existence of a route between Gateway and Service.

"There doesn't seem to be any use case for BackendTLSPolicy to be repurposed for something like external auth policy."

Sorry if I wasn't clear before. BackendTLSPolicy repurposed as external auth is not what I meant.

The way I see it, BackendTLSPolicy means "enable TLS on [all] traffic to this Service". This includes:

  • connections between gateways and the service, because the service is the backend (upstream) for a route rule; as well as
  • connections between gateways and the service, because the service is the backend for a route filter (i.e., not the ultimate goal of the outer request being handled by the gateway);
  • connections from other services to the service;
  • connections from any gateway or proxy to the service, due to the existence of any implementation-specific policy that causes the gateway to call that service even if a route linking them does not exist – e.g., a RateLimitPolicy (for which, as of today, there's no Gateway API kind of filter defined), an ExternalAuthPolicy that hypothetically an implementation introduces to enable targeting an entire Gateway (unlike the rather limited option defined in Gateway API today to do that via HTTPRoute filter only.)

guicassolato avatar Sep 15 '25 08:09 guicassolato

@guicassolato, you're not wrong that there's an implication there, but due to substantial crossover with service mesh use cases, we specifically removed language like that from the GEP. There just wasn't enough agreement on exactly how a BackendTLSPolicy would work in a more generic context to be able to move forward. So we concentrated on the more limited use case to meet that need.

Can we talk more about this once v1.4 is out and the current use case is well understood? Totally.

youngnick avatar Sep 18 '25 01:09 youngnick

To be clear, by not unequivocally stating whether a BackendTLSPolicy is for configuring TLS only for services referred to in the backendRefs fields of Gateway API resources or for any service at all, do we expect one of these two options to be tacitly implied, or are we intentionally keeping it open to interpretation until a better definition lands?

IOW, as an implementation, if I decide right now that Gateway API BackendTLSPolicy resources can be used in my implementation to configure TLS on services in general, is that acceptable or frowned upon?

guicassolato avatar Sep 19 '25 08:09 guicassolato

At the moment, the only defined behavior is when a Service is also a backendRef. Any other behavior is undefined, and, while an implementation can make a guess about what we will end up doing, that implementation is also risking that it might be guessing wrong and have to change things later.

In any case, I don't think that Gateway API can unilaterally make the call that BackendTLSPolicy must always be enforced in non-Gateway API contexts. Of course, an implementation can choose to use BackendTLSPolicy for that, in the same way that some other parts of Kubernetes have chosen to use ReferenceGrant for non-Gateway API use cases. But Gateway API can't really make rules to cover that, it's out of our remit.
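
The distinction drawn here can be sketched in Go roughly as follows. This is only an illustration of the reasoning, assuming a simple yes/no flag for "the Service is referenced as a backendRef"; `ServiceUse`, `ViaBackendRef`, `Status`, and `PolicyStatus` are hypothetical names and are not part of Gateway API or any implementation.

```go
package main

import "fmt"

// ServiceUse is a hypothetical description of how a gateway or proxy
// ends up calling a Service.
type ServiceUse struct {
	Description   string
	ViaBackendRef bool // assumption: true when the Service is also a backendRef
}

// Status is a hypothetical label for how well specified BackendTLSPolicy
// behavior is for a given use.
type Status string

const (
	Defined   Status = "defined by Gateway API"
	Undefined Status = "undefined (implementation's choice, may change)"
)

// PolicyStatus encodes the rule stated in the thread: behavior is only
// defined when the targeted Service is also a backendRef.
func PolicyStatus(u ServiceUse) Status {
	if u.ViaBackendRef {
		return Defined
	}
	return Undefined
}

func main() {
	uses := []ServiceUse{
		{"Service listed in an xRoute rule's backendRefs", true},
		{"Service called because of an implementation-specific policy", false},
		{"Service-to-service traffic with no Gateway API route", false},
	}
	for _, u := range uses {
		fmt.Printf("%s: %s\n", u.Description, PolicyStatus(u))
	}
}
```

An implementation that enforces BackendTLSPolicy in the `Undefined` cases is making its own choice, and may need to change later if the spec ends up defining those cases differently.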

youngnick avatar Sep 23 '25 00:09 youngnick

Sounds fair, @youngnick.

I would like to keep this issue open for now as a suggestion to review the current spec with a "MAY be used in non-Gateway API contexts" perhaps, if that's OK.

Thanks!

guicassolato avatar Sep 23 '25 07:09 guicassolato

This has implications for the inference extension Gateway -> Endpoint Picker connection too, see https://github.com/kubernetes-sigs/gateway-api-inference-extension/issues/582

/cc @LiorLieberman

mikemorris avatar Oct 08 '25 19:10 mikemorris

I would like to unblock this subject with the proposal of extending BackendTLSPolicy specification as:

"Implementations MAY use BackendTLSPolicy to define how a proxy will communicate with backends that are not related to an xRoute"

We can word it so that this is implementation-specific, while still allowing cases like:

  • On mesh implementations, producers can define how consumers should communicate with their backends
  • On mesh implementations, consumers can define how they should communicate with a given backend
  • On Gateway implementations, the cluster admin can define how the proxy should reach external auth or external proc (e.g., endpoint picker, ext_auth, some WAF proposal, etc.)

This gives the opportunity to reuse the API to define communication behavior without deviating from the current specification of BackendTLSPolicy. Maybe it deserves a GEP update (or a memorandum GEP, if we decide that this is not an API change but an acceptance of behavior).

Edit: BTW, we will need another update to allow TLSRoute re-encryption, saying that BackendTLSPolicy can now be used by TLSRoute, so maybe we can put it all in the same bucket of updating the BTLSPolicy behavior.

rikatz avatar Oct 20 '25 18:10 rikatz

Now that we're past the 1.4 release, @guicassolato @rikatz would either of y'all consider opening a PR with proposed language here?

I'd very much like to get this clarified (and I'm not sure if a MAY/implementation-specific recommendation is sufficient) due to the corresponding broader implications of implicit "global" applicability for Service-targeted policies as discussed in https://github.com/kubernetes-sigs/gateway-api/pull/3876#discussion_r2159520328 (due to the lack of a "from" scoping mechanism).

mikemorris avatar Dec 11 '25 21:12 mikemorris

Yes, I think we should, Mike.

I am not sure if the right path here is to update the BackendTLSPolicy GEP, or to create a new one extending its usage to:

  • TLSRoute (GEP is merged now!)
  • Other services

For TLSRoute, IMO it should be an extended support behavior that lives together with the Termination feature. I will start writing something here and let you know.

rikatz avatar Dec 15 '25 20:12 rikatz