EndpointSlice Support

Open andrewsykim opened this issue 4 years ago • 11 comments

EndpointSlice is now enabled by default in Kubernetes and offers some solid performance benefits for clusters with large Endpoint sets. It might be worthwhile to support consuming EndpointSlice, at least as optional.

andrewsykim avatar Jul 17 '20 16:07 andrewsykim

Thanks for raising this @andrewsykim, totally agree that we should support consuming EndpointSlice in some fashion.

youngnick avatar Jul 19 '20 23:07 youngnick

Hi Team. Agree that Contour should support EndpointSlice in the future. I just discussed with @stevesloka and would like to start working on an experimental implementation. Will send you an update once it's done.

zianke avatar Jul 29 '20 20:07 zianke

@zianke has a branch which implements endpoint slices (https://github.com/projectcontour/contour/compare/master...zianke:endpointslice) which is fantastic! 🎉

Before pushing up a PR, and to help shape it: when would we want to add support for EndpointSlices? It's still beta in Kubernetes v1.17, so we can't fully move to it. We should probably add a feature flag to enable it, which would disable the current Endpoints implementation.

After we get this merged we'd also need to build out some E2E tests which mock up the behavior.

stevesloka avatar Aug 04 '20 15:08 stevesloka

Great work on getting something done so quickly @zianke!

However, given that we're working on the Endpoint translator at the moment, this may be something that will need refactoring if we change anything about EndpointTranslator or ClusterLoadAssignment.

I'd also like to do our best not to add feature flags for enabling Kinds - in particular, EndpointSlice is designed so that if it's available, you can use either it or Endpoints, relatively transparently. I'd prefer to see detection of enabled Kinds used rather than a feature flag.
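As a rough illustration of detecting enabled Kinds instead of using a feature flag, here is a minimal sketch. The `apiResourceList` type and `chooseEndpointSource` function are hypothetical stand-ins, not Contour code; in a real controller the list of served kinds would presumably come from the client-go discovery client (for example `ServerResourcesForGroupVersion`). The `discovery.k8s.io/v1` group/version is the GA one; a beta-era cluster would serve `discovery.k8s.io/v1beta1` instead.

```go
package main

import "fmt"

// apiResourceList is a hypothetical stand-in for what the API server
// reports as served kinds for a single group/version.
type apiResourceList struct {
	GroupVersion string
	Kinds        []string
}

// endpointSource names the resource the controller would watch.
type endpointSource string

const (
	sourceEndpointSlice endpointSource = "EndpointSlice"
	sourceEndpoints     endpointSource = "Endpoints"
)

// chooseEndpointSource returns EndpointSlice when the server advertises it
// under discovery.k8s.io/v1, and falls back to Endpoints otherwise.
func chooseEndpointSource(served []apiResourceList) endpointSource {
	for _, list := range served {
		if list.GroupVersion != "discovery.k8s.io/v1" {
			continue
		}
		for _, kind := range list.Kinds {
			if kind == "EndpointSlice" {
				return sourceEndpointSlice
			}
		}
	}
	return sourceEndpoints
}

func main() {
	served := []apiResourceList{
		{GroupVersion: "v1", Kinds: []string{"Service", "Endpoints"}},
		{GroupVersion: "discovery.k8s.io/v1", Kinds: []string{"EndpointSlice"}},
	}
	fmt.Println("watching:", chooseEndpointSource(served))
}
```

The point of this shape is that no user-facing switch is needed: the decision is made once at startup from what the cluster actually serves.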

youngnick avatar Aug 04 '20 22:08 youngnick

Hey, following up: what's the status of EndpointSlices? And does Contour currently support External Service Routing with EndpointSlices with the CNAME address type?

dprotaso avatar Aug 24 '21 20:08 dprotaso

Thinking more about this, I think that we should confirm what K8s version EndpointSlice is GA in, and if that's less than our minimum supported (which it almost certainly is), we should move to using EndpointSlices exclusively (since Kube will ensure that changes are propagated as necessary back to Endpoints).

There are some things that we may need to be careful about, since I believe that EndpointSlices can be used for other stuff (like External Service Routing and locality routing), so we will also need to validate whether we need to do anything special with the resource, or if we can just migrate the simple logic we have now to the new resource.

But I agree we should look at this one soon. @xaleeks, what do you think?

youngnick avatar Aug 30 '21 06:08 youngnick

I think that we should confirm what K8s version EndpointSlice is GA

Beta in 1.17, GA in 1.21

GA doesn't necessarily mean all things are supported since it depends on the DNS provider for the cluster. I think it just means the API is stable.

For example - CoreDNS EndpointSlice support was added in 1.8.4, which is bundled with K8s 1.22 (via kubeadm).

Note you may need to watch a subset of Endpoints objects that aren't mirrored as EndpointSlices - details here: https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/#endpointslice-mirroring

dprotaso avatar Aug 30 '21 14:08 dprotaso

@youngnick Yup, we should support this. However, EndpointSlice started in k8s 1.17 it seems. Do we need to add a check that the version of K8s running supports it before Contour is deployed? I guess all the supported versions of Contour today would necessitate a version of k8s that supports it so we should be in the clear.

Tagging it for 1.20 for now to start investigating, but we can also slot this into 1.21.

xaleeks avatar Aug 31 '21 18:08 xaleeks

Also according to https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/, it allows for setting a "topology.kubernetes.io/zone" key. Which opens us to this https://kubernetes.io/docs/concepts/services-networking/service-topology/

So a client can hit their preferred node, as specified in the topologyKeys for their service, when using Ingress. I'm not sure if that means this ServiceTopology feature is unsupported unless we support EndpointSlice. Interesting opportunity for anyone looking to utilize this.

xaleeks avatar Aug 31 '21 18:08 xaleeks

If we are already mandating a newer version of Kubernetes than 1.17 (which I think we are), then we just need to call out when we start doing EndpointSlices that 1.17 (or whatever version) is now required.

The zone topology is an interesting idea, yeah.

youngnick avatar Sep 01 '21 05:09 youngnick

topologyKeys is deprecated fwiw. replaced by https://kubernetes.io/docs/concepts/services-networking/topology-aware-hints/

howardjohn avatar Sep 01 '21 15:09 howardjohn

I can take this up. I was also able to write an EndpointSliceTranslator; however, I am a bit confused about the part where we observe the k8s resources (endpoints etc.). The name of an EndpointSlice has an additional suffix, and I am not really sure if that will come into the picture while reading the slice. If someone could point me to any dev document, or simply to where in the code the k8s resources are read, that would be helpful.
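On the name suffix: EndpointSlices are associated with their Service through the `kubernetes.io/service-name` label rather than through the object name, so the generated suffix does not need to be parsed. A minimal sketch of grouping slices by that label follows. The `endpointSlice` and `serviceKey` types and the `addressesByService` function are hypothetical, simplified stand-ins for illustration only, not the real `discovery.k8s.io/v1` types or Contour's code.

```go
package main

import (
	"fmt"
	"sort"
)

// serviceNameLabel is the label Kubernetes sets on an EndpointSlice to
// record which Service it belongs to.
const serviceNameLabel = "kubernetes.io/service-name"

// endpointSlice is a simplified, hypothetical stand-in for the real type.
type endpointSlice struct {
	Namespace string
	Name      string // generated, e.g. "web-abc12"; not used for matching
	Labels    map[string]string
	Addresses []string
}

// serviceKey identifies a Service by namespace and name.
type serviceKey struct {
	Namespace string
	Name      string
}

// addressesByService merges the addresses of all slices that carry the
// service-name label, keyed by the owning Service.
func addressesByService(slices []endpointSlice) map[serviceKey][]string {
	out := make(map[serviceKey][]string)
	for _, s := range slices {
		svc, ok := s.Labels[serviceNameLabel]
		if !ok {
			continue // slice is not associated with a Service
		}
		key := serviceKey{Namespace: s.Namespace, Name: svc}
		out[key] = append(out[key], s.Addresses...)
	}
	for key := range out {
		sort.Strings(out[key]) // stable order regardless of slice order
	}
	return out
}

func main() {
	slices := []endpointSlice{
		{Namespace: "default", Name: "web-abc12", Labels: map[string]string{serviceNameLabel: "web"}, Addresses: []string{"10.0.0.2"}},
		{Namespace: "default", Name: "web-xyz34", Labels: map[string]string{serviceNameLabel: "web"}, Addresses: []string{"10.0.0.1"}},
	}
	fmt.Println(addressesByService(slices))
}
```

Because one Service can own several slices, any translator along these lines would need to merge them before building the load assignment, which is the main difference from reading a single Endpoints object.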

arjunsalyan avatar Feb 01 '23 05:02 arjunsalyan

Related Gateway Issue hopes to exercise different Service/Endpoint[Slice] variants - https://github.com/kubernetes-sigs/gateway-api/issues/1718

dprotaso avatar Feb 22 '23 21:02 dprotaso

Also related: EndpointSlices/FQDN could be a replacement for Service type=ExternalName.

https://github.com/projectcontour/contour/issues/3950

dprotaso avatar Feb 22 '23 21:02 dprotaso

Confirming that this appears to create a hard cap of 1000 replicas that contour can track for a given service, as non-sliced endpoints are limited to 1000 members.

orenwolf avatar Mar 15 '23 23:03 orenwolf

This would also allow usage of Topology Aware Routing, which was added back to Kubernetes in the form of EndpointSlice hints in 1.23.
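To make the hints idea concrete, here is a minimal sketch of how a consumer might honor them. The `endpoint` type and `endpointsForZone` function are hypothetical; `ForZones` stands in for the zone names listed under an endpoint's `hints.forZones` in an EndpointSlice. Falling back to all endpoints when no hint matches is an assumption made for this example, not a statement about how Contour or Envoy would behave.

```go
package main

import "fmt"

// endpoint is a simplified, hypothetical stand-in for an EndpointSlice
// endpoint. ForZones mirrors the zone names found in hints.forZones.
type endpoint struct {
	Address  string
	ForZones []string
}

// endpointsForZone returns the endpoints hinted for the given zone.
// If no endpoint is hinted for that zone, it returns all endpoints so
// that traffic is never left without a backend (an assumed policy).
func endpointsForZone(all []endpoint, zone string) []endpoint {
	var hinted []endpoint
	for _, ep := range all {
		for _, z := range ep.ForZones {
			if z == zone {
				hinted = append(hinted, ep)
				break
			}
		}
	}
	if len(hinted) == 0 {
		return all
	}
	return hinted
}

func main() {
	all := []endpoint{
		{Address: "10.0.0.1", ForZones: []string{"zone-a"}},
		{Address: "10.0.1.1", ForZones: []string{"zone-b"}},
	}
	fmt.Println(endpointsForZone(all, "zone-a"))
}
```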

Quentin-M avatar Jun 23 '23 03:06 Quentin-M

Will plan to work on adding support for this.

rajatvig avatar Jun 23 '23 23:06 rajatvig