aws-application-networking-k8s

Update gateway api CRD versions?

Open vd-arnaud opened this issue 1 year ago • 11 comments

The current version (v1.0.5) of the gateway-api-controller chart ships CRDs that are quite old:

  • GatewayClass: v1alpha2,v1beta1
  • Gateway: v1alpha2, v1beta1
  • HTTPRoute: v1alpha2, v1beta1
  • GRPCRoute: v1alpha2

These CRDs come from this release, which is more than one year old.

The latest release from kubernetes-sigs includes:

  • GatewayClass: v1, v1beta1
  • Gateway: v1, v1beta1
  • GRPCRoute: v1, v1alpha2
  • HTTPRoute: v1, v1beta1
  • ReferenceGrant: v1alpha2, v1beta1

This leads to issues because other actors in this ecosystem use newer CRD versions; for example, the latest version of external-dns uses HTTPRoute v1. So one has to update this particular CRD to be able to use gateway-api-controller AND external-dns. Fortunately, the latest HTTPRoute CRD still includes v1beta1, but for how long?
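To make the mismatch concrete, here is a minimal Python sketch that compares the two version lists above. The dictionaries are copied from this issue; the helper names `supports` and `shared_versions` are made up for illustration. The sketch only compares version names and does not check whether a version is actually served by the API server.

```python
# CRD versions shipped with the gateway-api-controller chart v1.0.5 (from this issue).
CHART_CRDS = {
    "GatewayClass": {"v1alpha2", "v1beta1"},
    "Gateway": {"v1alpha2", "v1beta1"},
    "HTTPRoute": {"v1alpha2", "v1beta1"},
    "GRPCRoute": {"v1alpha2"},
}

# CRD versions in the latest kubernetes-sigs release (from this issue).
UPSTREAM_CRDS = {
    "GatewayClass": {"v1", "v1beta1"},
    "Gateway": {"v1", "v1beta1"},
    "GRPCRoute": {"v1", "v1alpha2"},
    "HTTPRoute": {"v1", "v1beta1"},
    "ReferenceGrant": {"v1alpha2", "v1beta1"},
}


def supports(crds, kind, version):
    """Return True if the CRD set defines `version` for `kind`."""
    return version in crds.get(kind, set())


def shared_versions(a, b):
    """For every kind present in both sets, list the versions both define."""
    return {kind: sorted(a[kind] & b[kind]) for kind in a if kind in b}


if __name__ == "__main__":
    # external-dns wants HTTPRoute v1, which the chart CRDs do not define.
    print("chart has HTTPRoute v1:", supports(CHART_CRDS, "HTTPRoute", "v1"))
    for kind, versions in shared_versions(CHART_CRDS, UPSTREAM_CRDS).items():
        print(kind, "->", versions)
```

Each kind has exactly one version in common between the two sets, which is why mixing tools built against different releases is fragile.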

It would be great if you could update the CRDs in a future release 🙏

vd-arnaud avatar May 16 '24 15:05 vd-arnaud

If you install the v1 Gateway API CRDs in your cluster as described here: https://github.com/aws/aws-application-networking-k8s/blob/1862bef9b5f4956b08f34b80723464a99682f542/docs/contributing/developer.md?plain=1#L44-L47 and then run the v1.0.5 controller and create a v1 Gateway and a v1 HTTPRoute, what happens? In the e2e test code we already use the v1 Gateway API resources, for example: https://github.com/aws/aws-application-networking-k8s/blob/e85369a9808835f4eab31c88edb9bcc920870e2c/test/suites/integration/httproute_path_match_test.go#L29 and it works for us.

But your suggestion really makes sense: we need to install the v1 CRDs by default in the Helm chart and use the v1 CRDs in the controller code.

zijun726911 avatar May 16 '24 15:05 zijun726911

I ran the test: I installed the latest release from kubernetes-sigs, then installed gateway-api-controller v1.0.5, and I got errors in the aws-gateway-controller-chart pods, so I didn't go further.

Here is a sample of the errors I got:

{"level":"error","ts":"2024-05-16T17:13:48.717Z","logger":"runtime.controller-runtime.source.EventHandler","caller":"source/kind.go:68","msg":"failed to get informer from cache","error":"failed to get API group resources: unable to retrieve the complete list of server APIs: gateway.networking.k8s.io/v1alpha2: the server could not find the requested resource"}

This makes sense: gateway-api-controller tries to fetch Gateway using v1alpha2, which is no longer available with the new CRDs. I "fixed" this error by downgrading the Gateway CRD, and then I got a similar error with GatewayClass.
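For anyone triaging similar logs, here is a small Python sketch that pulls the missing group/version out of a log entry like the one above. The log line is the one quoted in this comment; the function name `missing_group_version` and the regular expression are my own and only cover this one message format, so treat it as a starting point rather than a general parser.

```python
import json
import re

# The controller log entry quoted above, split across lines for readability.
LOG_LINE = (
    '{"level":"error","ts":"2024-05-16T17:13:48.717Z",'
    '"logger":"runtime.controller-runtime.source.EventHandler",'
    '"caller":"source/kind.go:68","msg":"failed to get informer from cache",'
    '"error":"failed to get API group resources: unable to retrieve the complete '
    'list of server APIs: gateway.networking.k8s.io/v1alpha2: the server could not '
    'find the requested resource"}'
)

# Matches "<group>/<version>: the server could not find ..." inside the error text.
_MISSING = re.compile(
    r"(?P<group>[a-z0-9.-]+)/(?P<version>v[0-9a-z]+): the server could not find"
)


def missing_group_version(line):
    """Return (group, version) the API server could not find, or None."""
    entry = json.loads(line)
    match = _MISSING.search(entry.get("error", ""))
    if match is None:
        return None
    return match.group("group"), match.group("version")


if __name__ == "__main__":
    print(missing_group_version(LOG_LINE))
```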

Thanks for your interest in this, please tell me if I can help 👍

vd-arnaud avatar May 16 '24 17:05 vd-arnaud

Hello, small update from our side, we are now using DNSEndpoint instead of HTTPRoute as a source for our external-dns configuration and it kind of solved the issue.

I still think it would be nice to have updated CRDs to prevent similar issues in the future (it is quite likely that different pieces of software will use these CRDs in the same cluster!)

vd-arnaud avatar May 23 '24 07:05 vd-arnaud

Any updates?

We're having trouble using other Gateway controllers such as Istio.

DingGGu avatar Jul 15 '24 10:07 DingGGu

> Any updates?
>
> We're having trouble using other Gateway controllers such as Istio.

same issue

seongpil0948 avatar Jul 22 '24 09:07 seongpil0948

@DingGGu or @seongpil0948 can you possibly share the error you're seeing and a little more about your setup?

I did some testing a while back to try to better understand how version mismatches were handled, using different configurations (alpha installed, but apply YAMLs with v1, and vice versa) and it all seemed to "just work". From what I could tell, kubernetes plumbing takes care of translating the requested version to whatever the local process knows.

The only issue I noticed was that the "mock" kubernetes client (mock_client.NewMockClient(c)) only handles explicit API versions, so unit tests don't translate objects across versions like the actual API does.

erikfuller avatar Jul 26 '24 21:07 erikfuller

What would really help are steps to repro this, if you have them. Agree that it would be ideal to move to latest versions, just need to ensure the upgrade path is smooth.

erikfuller avatar Jul 26 '24 21:07 erikfuller

Hi @erikfuller!

  1. Install the Gateway API CRDs v1.1.0, as you would before using the Kubernetes Gateway API with Istio: https://istio.io/latest/docs/tasks/traffic-management/ingress/gateway-api/#setup
kubectl kustomize "github.com/kubernetes-sigs/gateway-api/config/crd?ref=v1.1.0" | kubectl apply -f -
  2. Install the Lattice controller via Helm
helm install -n kube-system gateway-api-controller aws-gateway-controller-chart \
  -f values.yaml \
  --version "1.0.6"

Error logs in controller:

{"level":"error","ts":"2024-07-29T00:52:28.215Z","logger":"runtime.controller-runtime.source.EventHandler","caller":"source/kind.go:63","msg":"if kind is a CRD, it should be installed before calling Start","kind":"GRPCRoute.gateway.networking.k8s.io","error":"no matches for kind \"GRPCRoute\" in version \"gateway.networking.k8s.io/v1alpha2\""}

However, the cluster has GRPCRoute v1.

$ kubectl api-resources | grep grpc
grpcroutes                                                             gateway.networking.k8s.io/v1              true         GRPCRoute

DingGGu avatar Jul 29 '24 00:07 DingGGu

Thanks, @DingGGu that's super helpful. I'm going to be looking further into this one and hope to share some updates in the next week or so.

erikfuller avatar Sep 24 '24 23:09 erikfuller

I was able to get a local repro going. It looks like the GRPCRoute CRD that adds v1 has served: false on its v1alpha2 version, which is why requests for v1alpha2 aren't automatically translated by the Kubernetes API. Will look at options to resolve this.
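As a rough illustration of why the served flag matters, here is a Python sketch that models the `spec.versions` list of a CRD. The dictionary is hand-written to mirror what is described in this comment, not copied from an upstream release, and the helper names are hypothetical. The `name`, `served`, and `storage` fields are the standard CustomResourceDefinition version fields.

```python
# Hand-written stand-in for the GRPCRoute CRD described above (illustrative only).
GRPCROUTE_CRD = {
    "spec": {
        "group": "gateway.networking.k8s.io",
        "names": {"kind": "GRPCRoute"},
        "versions": [
            {"name": "v1", "served": True, "storage": True},
            # Still listed in the CRD, but the API server will not expose it.
            {"name": "v1alpha2", "served": False, "storage": False},
        ],
    }
}


def served_versions(crd):
    """Versions the API server exposes for this CRD."""
    return [v["name"] for v in crd["spec"]["versions"] if v.get("served", False)]


def can_request(crd, version):
    """A client can only list or watch a version that is served."""
    return version in served_versions(crd)


if __name__ == "__main__":
    print("served:", served_versions(GRPCROUTE_CRD))
    # The v1.0.x controller watches v1alpha2, which is listed but not served.
    print("v1alpha2 usable:", can_request(GRPCROUTE_CRD, "v1alpha2"))
```

A version that is listed but not served looks the same to a client as a version that does not exist, which matches the "no matches for kind" error reported earlier in the thread.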

erikfuller avatar Oct 05 '24 00:10 erikfuller

Why, for God's sake, does this controller distribute other projects' CRDs at all? Neither External DNS nor the Kubernetes Gateway API has any relation to this project. Just don't install the GW API CRDs, please!

savealive avatar Oct 06 '24 02:10 savealive

Draft PR is out. Looking to get some feedback on the upgrade process from folks with an existing deployment:

Recommended upgrade steps (from controller v1.0.X):

  1. Back up configuration, especially GRPCRoute objects
  2. Disable v1.0.x controller (e.g. scale to zero)
  3. Update to GW API v1.1.0, which still includes the deprecated v1alpha2 version (alternatively, update only the GRPCRoute CRD)
  4. Save GRPCRoute objects as YAML, modify API version to v1
  5. Apply changes to GRPCRoute objects (now on v1)
  6. Update to GW API v1.2.0 (optional)
  7. Deploy and launch new controller version (v1.1.0 - not yet released)
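Step 4 could be scripted along these lines. This is only a sketch, not part of the documented upgrade process: it assumes the objects were exported as JSON (for example with `kubectl get grpcroutes -A -o json`), that the GRPCRoute spec can be carried over to v1 unchanged (please verify against the Gateway API docs for your objects), and that dropping the server-populated metadata fields listed below is appropriate for your setup. All function and constant names are made up for illustration.

```python
import copy
import json

OLD_API_VERSION = "gateway.networking.k8s.io/v1alpha2"
NEW_API_VERSION = "gateway.networking.k8s.io/v1"

# Fields set by the API server; removed so the object can be re-applied cleanly.
SERVER_FIELDS = ("resourceVersion", "uid", "creationTimestamp", "generation", "managedFields")


def migrate_grpcroute(obj):
    """Return a copy of obj with apiVersion bumped to v1 if it is a v1alpha2 GRPCRoute."""
    migrated = copy.deepcopy(obj)
    if obj.get("kind") != "GRPCRoute" or obj.get("apiVersion") != OLD_API_VERSION:
        return migrated  # leave anything else untouched
    migrated["apiVersion"] = NEW_API_VERSION
    for field in SERVER_FIELDS:
        migrated.get("metadata", {}).pop(field, None)
    migrated.pop("status", None)
    return migrated


def migrate_export(doc):
    """Handle either a single object or a kubectl 'List' wrapper."""
    if doc.get("kind") == "List":
        return {**doc, "items": [migrate_grpcroute(item) for item in doc.get("items", [])]}
    return migrate_grpcroute(doc)


if __name__ == "__main__":
    exported = {
        "apiVersion": "v1",
        "kind": "List",
        "items": [
            {
                "apiVersion": OLD_API_VERSION,
                "kind": "GRPCRoute",
                "metadata": {"name": "example", "namespace": "default", "uid": "abc"},
                "spec": {"parentRefs": [{"name": "my-gateway"}]},
                "status": {},
            }
        ],
    }
    print(json.dumps(migrate_export(exported), indent=2))
```

Keep the original backup from step 1 regardless, so the export can be compared against what was re-applied.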

erikfuller avatar Nov 29 '24 19:11 erikfuller

Code and docs merged. Will be available in next release (v1.1.0)

erikfuller avatar Dec 13 '24 22:12 erikfuller