gateway-api
[kind/api-change] Allow wildcard suffix in hostnames
What type of PR is this? /kind gep
What this PR does / why we need it: It adds support for wildcard suffixes in addition to wildcard prefixes in the Hostname type. It is fully backwards-compatible.
Which issue(s) this PR fixes:
Fixes https://github.com/kubernetes-sigs/gateway-api/issues/3643
Does this PR introduce a user-facing change?:
Added support for wildcard suffixes in hostnames (like example.*)
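To make the proposed change concrete, here is a minimal Go sketch of a validator that accepts either a `*.` prefix or a `.*` suffix, but not both. The regular expression, the `label` constant, and the `isValidHostname` name are assumptions for illustration; they are not the validation pattern from this PR or from the Gateway API types.

```go
package main

import "regexp"

// label matches a single lowercase DNS label (assumed RFC 1123 style).
const label = `[a-z0-9]([-a-z0-9]*[a-z0-9])?`

// hostnameRe is a hypothetical pattern, not the one used by Gateway API.
// First alternative: optional "*." prefix followed by dot-separated labels.
// Second alternative: dot-separated labels followed by a ".*" suffix.
var hostnameRe = regexp.MustCompile(
	`^((\*\.)?` + label + `(\.` + label + `)*|` +
		label + `(\.` + label + `)*\.\*)$`)

// isValidHostname reports whether h fits the sketch above and is at most
// 253 characters long.
func isValidHostname(h string) bool {
	return len(h) <= 253 && hostnameRe.MatchString(h)
}
```

Under these assumptions `example.*`, `*.example.com` and `example.com` are accepted, while `*.example.*`, a bare `*`, and `exa*mple.com` are rejected.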
The committers listed above are authorized under a signed CLA.
- :white_check_mark: login: simonfelding (37a6a9312bd045475c7a96993f3f2d60984781b5, 0f4be7c331b68242b0ed6db79ed17c528b82d167)
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: simonfelding
Once this PR has been reviewed and has the lgtm label, please assign aojea for approval. For more information see the Code Review Process.
Hi @simonfelding. Thanks for your PR.
I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.
Once the patch is verified, the new status will be reflected by the ok-to-test label.
@howardjohn Totally agree it needs careful review and discussion of interactions! I haven't been able to think of any that aren't already supported in the major implementations, nginx for example, but there might be something I can't come up with.
Envoy also supports it and it sounds like the developers have already thought these issues through: (Wildcard hosts are supported in the suffix or prefix form).
The top level element in the routing configuration is a virtual host. Each virtual host has a logical name as well as a set of domains that get routed to it based on the incoming request’s host header. This allows a single listener to service multiple top level domain path trees. Once a virtual host is selected based on the domain, the routes are processed in order to see which upstream cluster to route to or whether to perform a redirect.
domains (repeated string, REQUIRED) A list of domains (host/authority header) that will be matched to this virtual host. Wildcard hosts are supported in the suffix or prefix form.
Domain search order:
- Exact domain names: www.foo.com.
- Suffix domain wildcards: *.foo.com or *-bar.foo.com.
- Prefix domain wildcards: foo.* or foo-*.
- Special wildcard * matching any domain.
Note: The wildcard will not match the empty string. e.g. *-bar.foo.com will match baz-bar.foo.com but not -bar.foo.com. The longest wildcards match first. Only a single virtual host in the entire route configuration can match on *. A domain must be unique across all virtual hosts or the config will fail to load.
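The search order quoted above can be sketched in Go as follows. This is only a reading of the documented rules (exact, then `*`-leading wildcards, then `*`-trailing wildcards, then the bare `*`, longest wildcard first, wildcard never matching the empty string). It is not Envoy's implementation, and `rank` and `selectDomain` are hypothetical names.

```go
package main

import "strings"

// rank reports whether pattern matches host. class orders the match kinds
// (0 exact, 1 leading "*", 2 trailing "*", 3 bare "*"); n is the length of
// the literal part, used to prefer the longest wildcard.
func rank(pattern, host string) (class, n int, ok bool) {
	switch {
	case pattern == host:
		return 0, len(pattern), true
	case pattern == "*":
		return 3, 0, true
	case strings.HasPrefix(pattern, "*"):
		rest := pattern[1:]
		// len(host) > len(rest) means "*" stands for at least one character.
		if len(host) > len(rest) && strings.HasSuffix(host, rest) {
			return 1, len(rest), true
		}
	case strings.HasSuffix(pattern, "*"):
		rest := pattern[:len(pattern)-1]
		if len(host) > len(rest) && strings.HasPrefix(host, rest) {
			return 2, len(rest), true
		}
	}
	return 0, 0, false
}

// selectDomain returns the best matching pattern for host, if any.
func selectDomain(patterns []string, host string) (string, bool) {
	best, bestClass, bestLen, found := "", 0, 0, false
	for _, p := range patterns {
		class, n, ok := rank(p, host)
		if !ok {
			continue
		}
		if !found || class < bestClass || (class == bestClass && n > bestLen) {
			best, bestClass, bestLen, found = p, class, n, true
		}
	}
	return best, found
}
```

With the patterns `*`, `foo.*`, `*.foo.com` and `www.foo.com`, this sketch picks `www.foo.com` for `www.foo.com`, `*.foo.com` for `api.foo.com`, `foo.*` for `foo.org`, and `*` for `bar.org`.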
I don't think it has any interaction with certs? It's true that certs only match on a single wildcard subdomain; the Gateway API already violates this by matching *.example.com with cringle.bingle.example.com if there is no better match. SNI is all about finding the right certificate for a given hostname before the cert is served and the TLS handshake begins.
Let's say a server has two virtual hosts, grafana.example.com and grafana.example.org. When an HTTPS request comes in, the client_hello stage is still unencrypted and contains the requested hostname.
In the case of Host: grafana.example.com and an HTTPRoute matching the exact hostname, it's obvious what cert to serve. In the case of Host: grafana.another.com, it should be rejected as there are no matching certs. In the case of Host: grafana.example.org, the principle of the best match should be followed, where an HTTPRoute with grafana.example.org or *.example.org is a better match than grafana.*. This is how Envoy does it too. Either way, this decision is made before the cert is served.
Envoy supports it for HTTP Host headers, but the API you are changing here impacts SNI, where it's not supported: https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/listener/v3/listener_components.proto#config-listener-v3-filterchainmatch
It's not in the given examples in that document, but I don't see why it wouldn't work? SNI is just a simple string extracted from the client_hello.
I'm not deep enough in the envoy code to be sure, but from a superficial look at the code I can't see any validation that treats the SNI as a special type of hostname string that cannot be wildcard-matched like other hostnames can.
When searching for a virtual server by name, if name matches more than one of the specified variants, e.g. both wildcard name and regular expression match, the first matching variant will be chosen, in the following order of precedence:
- exact name
- longest wildcard name starting with an asterisk, e.g. “*.example.org”
- longest wildcard name ending with an asterisk, e.g. “mail.*”
- first matching regular expression (in order of appearance in a configuration file)
The name is determined in the following order:
Virtual server selection
First, a connection is created in a default server context. Then, the server name can be determined in the following request processing stages, each involved in server configuration selection:
- during SSL handshake, in advance, according to SNI
- after processing the request line
- after processing the Host header field
If the server name was not determined after processing the request line or from the Host header field, nginx will use the empty name as the server name.
At each of these stages, different server configurations can be applied.
I also know for sure that the Citrix Netscaler has no problem treating the client hello SNI string as any other string and making complex routing decisions based on that - prior to serving a cert.
tl;dr: SNI just means sending the hostname before the cert is served, which makes routing a TLS connection work like unencrypted HTTP. After the hostname is matched with a server (by SNI or by Host header), a cert is served. I'm pretty confident this GEP does not impact SNI - it's only about whether or not we can allow a Gateway implementation to match a hostname to a string ending with a wildcard :)
> I'm not deep enough in the envoy code to be sure, but from a superficial look at the code I can't see any validation that treats the SNI as a special type of hostname string that cannot be wildcard-matched like other hostnames can.
https://github.com/envoyproxy/envoy/blob/8784289b97003c3b2a36824b4a810112d62b7bfa/source/common/listener_manager/filter_chain_manager_impl.cc#L121-L123
> I'm pretty confident this GEP does not impact SNI
It's changing the valid set of matchers against SNI, so it does.
Dang, nice find. You're right, Envoy does not support it right now because of that line. Deserves a PR to that as well :)
But anyway, why should the Gateway API decide whether or not the implementation can get the wildcard-suffixed hostname? I think it's pretty clear that nginx supports it, for example, and it looks to me like it's just a matter of submitting a minor patch to make Envoy support it too.
Thanks for this PR @simonfelding, I appreciate you being so proactive in working on solving a problem you've identified.
However, there are a few problems with accepting this as is, as @howardjohn said earlier. There's the "Can data planes even do this?" question, which it seems that some may be able to, and the "hostname also matches SNI, and SNI can't do that" problem.
However, the larger problem for me is the one about the interaction between the hostname field on the Gateway Listener and the hostname field on an HTTPRoute.
These fields have a reasonably complex interaction already:
- The `hostname` field on Gateway Listener specifies which hostnames should be allowed by that Listener, covering both SNI and Host header matches.
- The `hostname` field on HTTPRoute specifies which hostnames should be allowed by that specific HTTPRoute, and is also used for picking which HTTPRoutes can attach to which Gateway Listeners.
- When both Gateway Listener `hostname` and HTTPRoute `hostname` are specified, they need to intersect, where intersection is defined in the HTTPRoute `hostname` field: https://github.com/kubernetes-sigs/gateway-api/blob/8c7f33516cf0b7fa4d139b004fdbcf1a904dc2ef/apis/v1/httproute_types.go#L59-L114. There are also some Listener-specific definitions at https://github.com/kubernetes-sigs/gateway-api/blob/8c7f33516cf0b7fa4d139b004fdbcf1a904dc2ef/apis/v1/gateway_types.go#L337-L385.
- The reason that we picked label-based, leftward-only wildcard match is that it makes this interaction easier to define - in particular, using whole labels rather than a string match makes defining intersection much easier (although it makes it harder for many implementations to actually implement).
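To show why whole-label, leftward-only wildcards keep the intersection simple, here is a rough Go sketch. It is a simplified reading of the rules linked above, not the project's implementation or conformance logic; the `intersects` name is hypothetical, and an empty string is assumed to mean "no hostname specified".

```go
package main

import "strings"

// intersects reports whether a Listener hostname and an HTTPRoute hostname
// overlap, assuming only "*." prefix wildcards exist. An unset (empty)
// hostname is assumed to match everything.
func intersects(listener, route string) bool {
	if listener == "" || route == "" {
		return true
	}
	lw := strings.HasPrefix(listener, "*.")
	rw := strings.HasPrefix(route, "*.")
	switch {
	case !lw && !rw:
		// Two exact names overlap only when they are identical.
		return listener == route
	case lw && !rw:
		// Keep the leading dot so the suffix test respects label boundaries:
		// "*.example.com" covers "test.example.com" but not "example.com".
		return strings.HasSuffix(route, listener[1:])
	case !lw && rw:
		return strings.HasSuffix(listener, route[1:])
	default:
		// Two wildcards overlap when one dotted suffix contains the other.
		l, r := listener[1:], route[1:]
		return strings.HasSuffix(l, r) || strings.HasSuffix(r, l)
	}
}
```

Because every wildcard sits on the left and stands for whole labels, each case reduces to one suffix comparison. A rightward, string-wise wildcard such as `example.*` would add further cases, for example deciding whether `*.example.com` and `example.*` intersect.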
Adding a rightwards, string-wise match also requires understanding and updating all of that documentation, and the required conformance tests (particularly ones like https://github.com/kubernetes-sigs/gateway-api/blob/main/conformance/tests/gateway-http-listener-isolation.go that test how the hostname intersection works).
I'll be honest here, I think that the decrease in amount of required config may not be worth the increase in complexity this would entail, as users of the API often take a long while to understand this intersection as it is now.
That said, I'd be happy to be proven wrong here, so I encourage you to take a look at those places, and look at adding language to handle the interactions.
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.
This bot triages PRs according to the following rules:
- After 90d of inactivity, `lifecycle/stale` is applied
- After 30d of inactivity since `lifecycle/stale` was applied, `lifecycle/rotten` is applied
- After 30d of inactivity since `lifecycle/rotten` was applied, the PR is closed
You can:
- Mark this PR as fresh with `/remove-lifecycle stale`
- Close this PR with `/close`
- Offer to help out with Issue Triage
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/lifecycle rotten
/close
@k8s-triage-robot: Closed this PR.