TLSRoute: Require hostnames
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR makes the hostnames field in TLSRoute non-optional. Hostnames is the only available match for TLSRoutes and, per the TLSRoute spec, at least one hostname match is required. If SNI hostname matching is not needed, TCPRoute should be used instead.
Which issue(s) this PR fixes:
Fixes #3871
Does this PR introduce a user-facing change?:
- Adds v1alpha3 experimental TLSRoute with the hostnames field being required.
Welcome @rostislavbobo!
It looks like this is your first PR to kubernetes-sigs/gateway-api 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.
You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.
You can also check if kubernetes-sigs/gateway-api has its own contribution guidelines.
You may want to refer to our testing guide if you run into trouble with your tests not passing.
If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!
Thank you, and welcome to Kubernetes. :smiley:
Hi @rostislavbobo. Thanks for your PR.
I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.
Once the patch is verified, the new status will be reflected by the ok-to-test label.
I understand the commands that are listed here.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Thanks @rostislavbobo!
/ok-to-test
I think this is a good idea, but unfortunately, I think we need to add the new version, and move the v1alpha2 source files to mostly be aliases to the new objects (except for this change).
That will have the generator include both v1alpha2 and v1alpha3 into the generated YAML files.
Alternatively, we will need to very clearly document that folks must ensure that the v1alpha2 CRD objects are not removed from the cluster before the actual TLSRoute objects are migrated to v1alpha3, or all the TLSRoute objects will disappear.
This is because I know that people are using TLSRoute in production. I know they shouldn't be, but they definitely are, and we will screw them over very hard if we are not careful.
I think this is a good idea, but unfortunately, I think we need to add the new version, and move the
v1alpha2source files to mostly be aliases to the new objects (except for this change). That will have the generator include bothv1alpha2andv1alpha3into the generated YAML files.
+1
Alternatively, we will need to very clearly document that folks must ensure that the
v1alpha2CRD objects are not removed from the cluster before the actual TLSRoute objects are migrated tov1alpha3, or all the TLSRoute objects will disappear. This is because I know that people are using TLSRoute in production. I know they shouldn't be, but they definitely are, and we will screw them over very hard if we are not careful.
+1
I agree: as long as the costs are reasonably low, we should at least try to avoid breaking people on the API today.
Normally alpha APIs are fair game for complete disruption at any time, but TLSRoute, UDPRoute and TCPRoute have been in a unique "perma-alpha" for such a long time that I think if it's not a big lift, trying to make this easier for folks is fair.
/cc @rycieos
I know that people are using TLSRoute in production. I know they shouldn't be, but they definitely are
😆 yeah that's me. In my defense, I started using TLSRoutes in non-prod environments 2.5 years ago, hoping that it would graduate to GA along with the rest of Gateway API, at which point I would move it into prod. Obviously that didn't happen, and I recognize that taking that risk is on me.
I appreciate the work done to make such migrations painless, but I definitely understand that this part of the API is still in alpha and that disruptions are expected. I can't ask for more than keeping me in the loop, which the team has done great with so far.
I agree: as long as the costs are reasonably low, we should at least try to avoid breaking people on the API today.
Of course, @shaneutt , @youngnick , that's why we're not introducing this breaking change in v1alpha2 and are bumping the version to v1alpha3 instead.
unfortunately, I think we need to add the new version, and move the
v1alpha2source files to mostly be aliases to the new objects (except for this change). This is because I know that people are using TLSRoute in production.
I hear you, makes sense. Let me alias TLSRoutes v1alpha2.
I started using TLSRoutes in non-prod environments 2.5 years ago, hoping that it would graduate to GA along with the rest of Gateway API
This year I'm revamping TLSRoutes and hope this v1alpha3 finally sets it up for GA 🙂
When we're ready, I'll start coordinating this within the community and will keep you in the loop.
@Rycieos , let me know if there's anything else we should include.
@Rycieos , let me know if there's anything else we should include.
I have brought up #2111 a lot already, but I think it bears repeating. As @robscott pointed out in the original Slack thread for this PR (which I will quote here since Slack is dying soon),
If I'm understanding correctly, this doesn't require a change to TLSRoute API itself, but more expanding the interaction between a Gateway and TLSRoute to allow for the combination of TLS termination and TLSRoute? It is absolutely something we should keep in mind though. I think/hope this version rev to v1alpha3 would set us up for GA of TLSRoute, and I'd really like to have the interaction with Gateway well specified before that point.
So IMO, #2111 should come at the same time as v1alpha3, or at the very latest in v1beta1, assuming we don't skip the beta stage for TLSRoute. How that ties in to Gateway resource versions I don't think has been figured out yet.
I think we need to add the new version, and move the
v1alpha2source files to mostly be aliases to the new objects (except for this change).
@youngnick , v1alpha3 is already based on v1alpha2 (see v1alpha3/backendtlspolicy_types.go, v1alpha3/zz_generated.deepcopy.go). That means we can't reference v1alpha2 from v1alpha3. And I don't think we should change that, as this is the right way to handle versioning and aliasing in the v1alpha3 case.
I've put the new TLSRoute into v1alpha3 and just referenced reused v1alpha2 types (such as TLSRouteStatus and TLSRouteRule).
If we want to add all remaining XRoutes to v1alpha3, we'd need to keep the v1alpha2 XRoutes as they are, and then create new v1alpha3 XRoutes that reference these original v1alpha2 definitions. I personally don't see much value in doing this. What do you think?
@rostislavbobo Yeah, I think that's fine. The intent here was to end up with the generated YAML having both versions, to save users from a breaking change when they install the v1.4 YAMLs (it's possible to have a TLSRoute saved in v1alpha2 that will not be valid in v1alpha3). Also, we can't even make a conversion webhook for this case, because we can't guess what the hostname should be if it previously was not present.
So, we're going to have to break people who weren't already setting a hostname in the TLSRoute.
But, if we only include v1alpha3 as the storage version in the generated CRD YAML, then the existing v1alpha2 objects will just vanish from the cluster when the definition is updated (since they're no longer a valid version) - even if they were using a hostname already and would translate just fine into v1alpha3.
So I think that the options here are:
- include
v1alpha2with breaking changes between it andv1alpha3. This will break people who are not using ahostname, in thatv1alpha2objects cannot be cast intov1alpha3objects. tbh I'm not sure how the apiserver will handle this case. - don't include
v1alpha2in the YAMLs. This will mean that everyone who is currently using TLSRoute will need to back up their routes inv1alpha2before installing v1.4, change them tov1alpha3, making sure that ahostnameis being set, then put them back after installing v1.4.
We can test this to find out what would happen in the first case by:
- setup a cluster that has Gateway API v1.3 installed, including
v1alpha2TLSRoute - create some TLSRoute objects, one that does have
hostname, and one that does not. - install the YAML from this PR
- See if you can get any of the existing TLSRoute objects. (I don't think you will see them any more after that update).
/retitle TLSRoute: Require hostnames and bump version to v1alpha3
Thanks @rostislavbobo!
/approve /cc @mlavacca @shaneutt @youngnick
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: robscott, rostislavbobo, shaneutt
The full list of commands accepted by this bot can be found here.
The pull request process is described here
- ~~OWNERS~~ [robscott,shaneutt]
Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment