gateway-api
Provide a way to configure TLS from a Gateway to Backends
What would you like to be added: The ability to configure TLS connections between a Gateway and Backends (also commonly referred to as upstream TLS or re-encryption). At a minimum, this should include the ability to select this protocol for these connections. Additionally, it would likely be helpful to be able to configure the CA bundle and SAN(s) used for validation.
Why this is needed: Previously discussed in #968, this is a widely desired feature that is currently missing from the API.
Note: This kind of feature will require a GEP in the future. This issue is initially intended just to track discussion around both interest level and potential approaches for this potential addition to the API.
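To make the request concrete, here is a purely hypothetical sketch of what such configuration might look like attached to a route. Every field under `tls` below is invented for discussion and is not part of the Gateway API:

```yaml
# Illustrative only -- the `tls` block and its field names do not exist in the API.
# The idea: a backendRef carries TLS settings for the hop from the Gateway to the backend.
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: HTTPRoute
metadata:
  name: example-route
spec:
  rules:
  - backendRefs:
    - name: example-svc
      port: 8443
      # Hypothetical block enabling and configuring TLS to the backend:
      tls:
        caCertificateRef:       # CA bundle used to validate the backend's certificate
          kind: ConfigMap
          name: backend-ca
        subjectAltNames:        # SAN(s) expected in the backend's certificate
        - example-svc.default.svc.cluster.local
```

Whether this lives on the route, the Gateway, or a separate policy resource is exactly the ownership question discussed below.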
I need your help. I would like to work on this. May I? Thanks!
I think this issue is waiting on feedback, use cases, and so on before we generate a GEP and get started on a design. Something that could be very helpful is to review what some specific CRDs do (Istio, Contour, Gloo, and OpenShift all have CRDs that I think let you specify this). That way we can be confident we've checked what the community is doing and hopeful that a design will be useful for everyone.
Writing down some thoughts for the future: TLS between the gateway and the upstream service is about encryption and identity. This includes a common trust (CA), SANs for the upstream services, and a client certificate in the gateway (in case the upstream wants to verify the identity of the gateway). The challenge here is to figure out the ownership of the above information:
- Trust stores are generally (though not always) defined at a level above the gateway or service. It is rare, and not recommended, to have a trust store that is specific to a single service. ReferencePolicy at the Gateway level seems to be a good fit here.
- The same could be said about the client certificate of the Gateway, although I'm not sure. A gateway could present different identities depending on the service it is talking to. Should we optimize for the common case here?
- The SAN to use for validating the upstream service is a tricky one. Does the upstream service owner define those? Or is that something the cluster admin or service admin defines? I can see use cases either way. The service admin and route owner can be the same or different person(s).
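If trust store and client certificate ownership land at the Gateway level, the shape might resemble the following. This is a hypothetical sketch for discussion; the `backendTLS` block and its fields are invented and not part of the API:

```yaml
# Hypothetical Gateway-level ownership: trust store and client certificate
# configured once per Gateway rather than per service. Fields are invented.
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: Gateway
metadata:
  name: example-gateway
spec:
  gatewayClassName: example
  listeners:
  - name: https
    protocol: HTTPS
    port: 443
  # Invented field: defaults for connections from this Gateway to backends.
  backendTLS:
    caCertificateRef:       # common trust (CA) used to validate backends
      kind: ConfigMap
      name: cluster-ca
    clientCertificateRef:   # identity the Gateway presents to backends
      kind: Secret
      name: gateway-client-cert
```

Cross-namespace references from such a block would presumably need to be gated by ReferencePolicy, per the ownership concerns above.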
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
/lifecycle stale
/remove-lifecycle stale
@candita has started work on supporting this functionality in https://github.com/kubernetes-sigs/gateway-api/pull/1430; see also the related discussion in https://github.com/kubernetes-sigs/gateway-api/discussions/1285.
/lifecycle stale
/lifecycle frozen
@candita is currently working on this (thank you Candace!)
/assign @candita
@shaneutt: GitHub didn't allow me to assign the following users: candita.
Note that only kubernetes-sigs members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. For more information please see the contributor guide
In response to this:
@candita is currently working on this (thank you Candace!)
/assign @candita
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@shaneutt Adding my arguments from #622 here as well, since it was closed as duplicate.
Use case:
An application developer working in a namespace wants to secure the traffic between the proxy and their backend service.
- They configure a server certificate for their service and want the gateway to perform server authentication against it.
- Symmetrically, the application developer wants to enable client authentication. This protects the backend service even further by allowing incoming TLS connections only from the proxy. To enable that, the application developer would need the capability to set a client certificate per backend service, not per Gateway, which would be outside the application developer's domain.
Implementing this enhancement has the following advantages:
- It allows a "self-service" workflow for the application developer, since they do not need to depend on the cluster admin to configure a client certificate at the gateway level.
- The cluster admin does not need to manage, coordinate, or distribute a single CA certificate for the development teams working in separate namespaces in the cluster. Each team can configure their own client certificates for the proxy, and their own CA certificates on their backend for validating the client cert.
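A self-service shape for this use case might keep everything in the application team's namespace. The kind and field names below are invented purely to illustrate the idea; none of this exists in the API:

```yaml
# Hypothetical per-backend policy owned by the application team. The kind
# `BackendTLSConfig` and all fields are invented for discussion only.
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: BackendTLSConfig
metadata:
  name: payments-backend-tls
  namespace: payments          # the app team's namespace, not the Gateway's
spec:
  targetRef:                   # the backend Service this policy applies to
    kind: Service
    name: payments
  validation:
    caCertificateRef:          # CA the gateway uses to validate the backend
      kind: ConfigMap
      name: payments-ca
  clientCertificateRef:        # client cert the gateway presents (mTLS)
    kind: Secret
    name: payments-gateway-client
```

Because both references stay in the app namespace, no cluster-admin coordination or ReferencePolicy grant would be needed for this flow.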
I know I'm responding to an old thread, but I wanted to add:
@hbagdi wrote
- Trust stores are generally (not always) defined at a level above gateway or service. It is not recommended and rare to have a trust-store that is specific to a service. ReferencePolicy at the Gateway level seems to be a good fit here.
- The same could be said about the client certificate of the Gateway, although I'm not sure. A gateway could have multiple identities to different services depending on the service it is talking to. Should we optimize for the common case here?
My experience in my organization is that people ask for per-service configuration, like in the use case I wrote above. It has proven complicated to coordinate the credentials at the cluster level (which is what we have in Contour).
Application developers likely prefer self-service, since configuring (mutually authenticated) TLS for the gateway -> service hop is closely related to the TLS configuration of their own backend service. They will be the first to see any potential problems, and they can troubleshoot them best.
This would imply that the gateway would need (a) a per-service trusted CA certificate to validate the server certificate of the backend, and (b) a per-service client certificate for authenticating towards the backend service.
- The SAN to use for validating upstream service is a tricky one. Does the upstream service owner define those? Or is that something the cluster-admin or service admin defines? I can see use-cases either way. Service admin and route owner can be the same or different person(s).
Assuming the upstream service has a hostname inside the cluster (the name of the Service), typical hostname validation can be done according to RFC 9110 (or, expressed a bit more clearly, the older RFC 2818).
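Concretely, standard hostname matching works without extra SAN configuration as long as the backend's certificate lists the Service's in-cluster DNS names as SANs. Assuming, for illustration, that cert-manager issues the backend certificate, that looks like:

```yaml
# Example only, assuming cert-manager is used to issue the backend certificate.
# The Service's in-cluster DNS names go into the SAN list (dnsNames), so
# RFC 9110 / RFC 2818 hostname matching by the gateway succeeds.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-svc-cert
  namespace: default
spec:
  secretName: example-svc-tls
  dnsNames:                    # SANs matched during hostname validation
  - example-svc.default.svc
  - example-svc.default.svc.cluster.local
  issuerRef:
    name: team-ca-issuer      # hypothetical per-team CA issuer
    kind: Issuer
```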
I hate to add more noise and scope creep to this, but one note: Kubernetes core just added ClusterTrustBundles, which are, afaik, designed to solve problems like this. However, it's only alpha in 1.27, so it's a ways off.
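For reference, a ClusterTrustBundle (certificates.k8s.io/v1alpha1, alpha in Kubernetes 1.27) distributes a CA bundle cluster-wide; a gateway implementation could, in principle, consume one as the trust anchor for backend validation. A minimal manifest, with the PEM content elided:

```yaml
# ClusterTrustBundle is a real alpha API in Kubernetes 1.27; whether a given
# gateway implementation consumes it for backend trust is an open question here.
apiVersion: certificates.k8s.io/v1alpha1
kind: ClusterTrustBundle
metadata:
  name: example-backend-ca
spec:
  # An optional signerName would tie the bundle to a certificate signer; omitted here.
  trustBundle: |
    -----BEGIN CERTIFICATE-----
    ...CA certificate PEM data...
    -----END CERTIFICATE-----
```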
@robscott @shaneutt @youngnick I am proposing a GEP in https://github.com/kubernetes-sigs/gateway-api/issues/1897
Closing in favor of #1897, thanks @candita!
/close
@robscott: Closing this issue.
In response to this:
/close