gateway-api
Provide a way to configure TLS from a Gateway to Backends
What would you like to be added: The ability to configure TLS connections between a Gateway and Backends (also commonly referred to as upstream TLS or re-encryption). At a minimum, this should include the ability to select this protocol for these connections. Additionally, it would likely be helpful to be able to configure the CA bundle and SAN(s) used for validation.
Why this is needed: Previously discussed in #968, this is a widely desired feature that is currently missing from the API.
Note: This kind of feature will require a GEP in the future. This issue is initially intended just to track discussion around both interest level and potential approaches for this potential addition to the API.
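To make the request concrete, here is a purely hypothetical sketch of what such configuration might look like attached to a route. Every field under `tls` below is invented for discussion and is not part of the Gateway API:

```yaml
# Illustrative only -- the `tls` block and its field names do not exist in the API.
# The idea: a backendRef carries TLS settings for the hop from the Gateway to the backend.
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: HTTPRoute
metadata:
  name: example-route
spec:
  rules:
  - backendRefs:
    - name: example-svc
      port: 8443
      # Hypothetical block enabling and configuring TLS to the backend:
      tls:
        caCertificateRef:       # CA bundle used to validate the backend's certificate
          kind: ConfigMap
          name: backend-ca
        subjectAltNames:        # SAN(s) expected in the backend's certificate
        - example-svc.default.svc.cluster.local
```

Whether this lives on the route, the Gateway, or a separate policy resource is exactly the ownership question discussed below.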
I need your help. I would like to work on this. May I? Thanks!
I think this issue is waiting on feedback, use cases, and so on before we generate a GEP and get started on a design. Something that could be very helpful is to review what some specific CRDs do (Istio, Contour, Gloo, and OpenShift all have CRDs that I think let you specify this). That way we can be confident we've checked what the community is doing and hopeful that a design will be useful for everyone.
Writing down some thoughts for the future: TLS between the gateway and the upstream service is about encryption and identity. This includes a common trust (CA), SANs for the upstream services, and a client certificate in the gateway (in case the upstream wants to verify the identity of the gateway). The challenge here is to figure out the ownership of the above information:
- Trust stores are generally (though not always) defined at a level above the gateway or service. It is rare, and not recommended, to have a trust store that is specific to a single service. ReferencePolicy at the Gateway level seems to be a good fit here.
- The same could be said about the client certificate of the Gateway, although I'm not sure. A gateway could present different identities depending on the service it is talking to. Should we optimize for the common case here?
- The SAN to use for validating the upstream service is a tricky one. Does the upstream service owner define those? Or is that something the cluster admin or service admin defines? I can see use cases either way. The service admin and route owner can be the same or different person(s).
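If trust store and client certificate ownership land at the Gateway level, the shape might resemble the following. This is a hypothetical sketch for discussion; the `backendTLS` block and its fields are invented and not part of the API:

```yaml
# Hypothetical Gateway-level ownership: trust store and client certificate
# configured once per Gateway rather than per service. Fields are invented.
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: Gateway
metadata:
  name: example-gateway
spec:
  gatewayClassName: example
  listeners:
  - name: https
    protocol: HTTPS
    port: 443
  # Invented field: defaults for connections from this Gateway to backends.
  backendTLS:
    caCertificateRef:       # common trust (CA) used to validate backends
      kind: ConfigMap
      name: cluster-ca
    clientCertificateRef:   # identity the Gateway presents to backends
      kind: Secret
      name: gateway-client-cert
```

Cross-namespace references from such a block would presumably need to be gated by ReferencePolicy, per the ownership concerns above.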
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.
This bot triages issues and PRs according to the following rules:
- After 90d of inactivity, lifecycle/stale is applied
- After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
- After 30d of inactivity since lifecycle/rotten was applied, the issue is closed
You can:
- Mark this issue or PR as fresh with /remove-lifecycle stale
- Mark this issue or PR as rotten with /lifecycle rotten
- Close this issue or PR with /close
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/remove-lifecycle stale
/lifecycle stale
/remove-lifecycle stale
@candita has started work on supporting this functionality in https://github.com/kubernetes-sigs/gateway-api/pull/1430; see also the related discussion in https://github.com/kubernetes-sigs/gateway-api/discussions/1285.
/lifecycle stale
/lifecycle frozen
@candita is currently working on this (thank you Candace!)
/assign @candita
@shaneutt: GitHub didn't allow me to assign the following users: candita.
Note that only kubernetes-sigs members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time. For more information please see the contributor guide
In response to this:
@candita is currently working on this (thank you Candace!)
/assign @candita
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
@shaneutt Adding my arguments from #622 here as well, since it was closed as duplicate.
Use case:
An application developer working in a namespace wants to secure the traffic between the proxy and their backend service.
- They configure a server certificate for their service and want the gateway to perform server authentication against it.
- Symmetrically, the application developer wants to enable client authentication. This protects the backend service even further by allowing incoming TLS connections only from the proxy. To enable that, the application developer would need the capability to set a client certificate per backend service, not per Gateway, which would be outside the application developer's domain.
Implementing this enhancement has the following advantages:
- It allows a "self-service" workflow for the application developer, since they do not need to depend on the cluster admin to configure a client certificate at the gateway level.
- The cluster admin does not need to manage, coordinate, or distribute a single CA certificate for the development teams working in separate namespaces in the cluster. Each team can configure their own client certificates for the proxy, and their own CA certificates on their backend for validating the client cert.
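A self-service shape for this use case might keep everything in the application team's namespace. The kind and field names below are invented purely to illustrate the idea; none of this exists in the API:

```yaml
# Hypothetical per-backend policy owned by the application team. The kind
# `BackendTLSConfig` and all fields are invented for discussion only.
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: BackendTLSConfig
metadata:
  name: payments-backend-tls
  namespace: payments          # the app team's namespace, not the Gateway's
spec:
  targetRef:                   # the backend Service this policy applies to
    kind: Service
    name: payments
  validation:
    caCertificateRef:          # CA the gateway uses to validate the backend
      kind: ConfigMap
      name: payments-ca
  clientCertificateRef:        # client cert the gateway presents (mTLS)
    kind: Secret
    name: payments-gateway-client
```

Because both references stay in the app namespace, no cluster-admin coordination or ReferencePolicy grant would be needed for this flow.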
I know I'm responding to an old thread, but I wanted to add:
@hbagdi wrote
- Trust stores are generally (not always) defined at a level above gateway or service. It is not recommended and rare to have a trust-store that is specific to a service. ReferencePolicy at the Gateway level seems to be a good fit here.
- The same could be said about the client certificate of the Gateway, although I'm not sure. A gateway could have multiple identities to different services depending on the service it is talking to. Should we optimize for the common case here?
My experience in my organization is that people ask for per-service configuration, like in the use case I wrote above. It has proven complicated to coordinate the credentials at the cluster level (which is what we have in Contour).
Application developers likely prefer self-service, since configuring (mutually authenticated) TLS for the gateway -> service hop is closely related to the TLS configuration of their own backend service. They will be the first to see any potential problems, and they can troubleshoot them best.
This would imply that the gateway would need (a) a per-service trusted CA certificate to validate the server certificate of the backend, and (b) a per-service client certificate for authenticating towards the backend service.
- The SAN to use for validating upstream service is a tricky one. Does the upstream service owner define those? Or is that something the cluster-admin or service admin defines? I can see use-cases either way. Service admin and route owner can be the same or different person(s).
Assuming the upstream service has a hostname inside the cluster (the name of the Service), typical hostname validation can be done according to RFC 9110 (or, expressed a bit more clearly, the older RFC 2818).
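Concretely, standard hostname matching works without extra SAN configuration as long as the backend's certificate lists the Service's in-cluster DNS names as SANs. Assuming, for illustration, that cert-manager issues the backend certificate, that looks like:

```yaml
# Example only, assuming cert-manager is used to issue the backend certificate.
# The Service's in-cluster DNS names go into the SAN list (dnsNames), so
# RFC 9110 / RFC 2818 hostname matching by the gateway succeeds.
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: example-svc-cert
  namespace: default
spec:
  secretName: example-svc-tls
  dnsNames:                    # SANs matched during hostname validation
  - example-svc.default.svc
  - example-svc.default.svc.cluster.local
  issuerRef:
    name: team-ca-issuer      # hypothetical per-team CA issuer
    kind: Issuer
```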
I hate to add more noise and scope creep to this, but one note: Kubernetes core just added ClusterTrustBundles, which are, afaik, designed to solve problems like this. However, it's only alpha in 1.27, so it's a ways off.
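For reference, a ClusterTrustBundle (certificates.k8s.io/v1alpha1, alpha in Kubernetes 1.27) distributes a CA bundle cluster-wide; a gateway implementation could, in principle, consume one as the trust anchor for backend validation. A minimal manifest, with the PEM content elided:

```yaml
# ClusterTrustBundle is a real alpha API in Kubernetes 1.27; whether a given
# gateway implementation consumes it for backend trust is an open question here.
apiVersion: certificates.k8s.io/v1alpha1
kind: ClusterTrustBundle
metadata:
  name: example-backend-ca
spec:
  # An optional signerName would tie the bundle to a certificate signer; omitted here.
  trustBundle: |
    -----BEGIN CERTIFICATE-----
    ...CA certificate PEM data...
    -----END CERTIFICATE-----
```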
@robscott @shaneutt @youngnick I am proposing a GEP in https://github.com/kubernetes-sigs/gateway-api/issues/1897
Closing in favor of #1897, thanks @candita!
/close
@robscott: Closing this issue.
In response to this:
/close