Add Support for EndpointSlices

Open robscott opened this issue 5 years ago • 22 comments

/area networking /kind feature

Describe the feature

It would be great if Knative could support EndpointSlices. This is a new Kubernetes resource that provides a scalable and extensible alternative to Endpoints. Although there may be more cases where this pattern is used, I've noticed that Knative manually creates Endpoints in some cases. With the EndpointSliceProxying feature gate enabled, kube-proxy will read exclusively from EndpointSlices, unfortunately disregarding any Endpoints without corresponding EndpointSlices and effectively breaking parts of Knative. I've been the main developer for this feature in Kubernetes and I'm happy to answer any questions or help anyone that's interested in adding support for this.

/cc @Cynocracy

robscott avatar Apr 23 '20 22:04 robscott

cc @vagababov @markusthoemmes

tcnghia avatar Apr 23 '20 22:04 tcnghia

/assign

It's quite interesting that this was the exact question I asked in Barcelona during the presentation of the API (what will happen to the APIs that manually create endpoints) and it was completely ignored by the presenters :-)

vagababov avatar Apr 23 '20 22:04 vagababov

Now to the actual problem at hand, @robscott , do we basically have to create both endpoints and endpointslices mirroring the contents?

vagababov avatar Apr 23 '20 22:04 vagababov

@vagababov Yep, that's what would be needed. We're also looking into ways we can mirror custom Endpoints to EndpointSlices automatically, but any automatic approach will be imperfect. As an example, EndpointSlices have topology fields that can't be derived from Endpoints resources. On the other hand, if you're not interested in anything specific with EndpointSlices, this potential automatic mirroring approach may suffice. I'll follow up here with more details as I have them.

robscott avatar Apr 23 '20 22:04 robscott
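
For readers following along, the manual mirroring discussed above (copying the contents of an Endpoints object into EndpointSlices) can be sketched roughly as follows. This is a minimal illustration working on plain dicts shaped like the Kubernetes manifests, not Knative's or Kubernetes' actual implementation. The function name `mirror_endpoints`, the slice naming scheme, and the `managed_by` value are hypothetical; the `discovery.k8s.io/v1` API version is the current stable one (the API was beta at the time of this thread).

```python
import ipaddress


def mirror_endpoints(endpoints, managed_by="example-controller"):
    """Build EndpointSlice-shaped dicts from an Endpoints-shaped dict.

    One slice is produced per (subset, IP family) pair, because an
    EndpointSlice carries a single addressType.
    """
    meta = endpoints["metadata"]
    slices = []
    for index, subset in enumerate(endpoints.get("subsets", [])):
        by_family = {}
        # Ready and not-ready addresses both become endpoints; only the
        # "ready" condition differs.
        for key, ready in (("addresses", True), ("notReadyAddresses", False)):
            for addr in subset.get(key, []):
                version = ipaddress.ip_address(addr["ip"]).version
                family = "IPv6" if version == 6 else "IPv4"
                by_family.setdefault(family, []).append(
                    {"addresses": [addr["ip"]], "conditions": {"ready": ready}}
                )
        for family, eps in by_family.items():
            slices.append(
                {
                    "apiVersion": "discovery.k8s.io/v1",
                    "kind": "EndpointSlice",
                    "metadata": {
                        # Hypothetical naming scheme, for illustration only.
                        "name": f"{meta['name']}-{index}-{family.lower()}",
                        "namespace": meta["namespace"],
                        "labels": {
                            # Ties the slice to its Service.
                            "kubernetes.io/service-name": meta["name"],
                            "endpointslice.kubernetes.io/managed-by": managed_by,
                        },
                    },
                    "addressType": family,
                    "ports": list(subset.get("ports", [])),
                    "endpoints": eps,
                }
            )
    return slices
```

Topology information is deliberately absent here: as noted above, it cannot be derived from the Endpoints object alone.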

Yeah, I am not sure we have any use for topology just yet. Might be interesting in the future with NLS work or to pick activators from the same zone as nodes, etc., but right now we just need a copy of the IP addresses.

vagababov avatar Apr 23 '20 23:04 vagababov

I wondered about this too, especially looking at #7260. We could have a more efficient solution there, not relying on subset ordering in the endpoints object if we had an endpointslice just for the activator IPs potentially.

It's only beta in 1.17 though, what's the availability of the API currently?

markusthoemmes avatar Apr 24 '20 06:04 markusthoemmes

The API itself is enabled by default in 1.17, the controller is enabled by default in 1.18, and kube-proxy is not configured to use EndpointSlices by default yet in any version. For now this will only come up for users that intentionally enable the EndpointSliceProxying feature gate in 1.18 or the EndpointSlice feature gate on kube-proxy in 1.16-1.17. As I've looked into this further, it looks like we should be able to solve this issue automatically with some kind of mirroring in 1.19 (before this feature is enabled by default).

It may be worth adding some kind of note in the documentation here that Knative is currently incompatible with the EndpointSliceProxying feature gate. I'll also be updating EndpointSlice docs to note the issue, and hope to have a more automatic mirroring solution in place for 1.19. I'm happy to close this issue for now as well since it sounds like you actually shouldn't need to do anything for EndpointSlice integration.

robscott avatar Apr 24 '20 16:04 robscott

Let's keep it open anyway, even if no integration will be required, since as Markus noted we might have some interesting usages for slices ourselves.

vagababov avatar Apr 24 '20 17:04 vagababov

Thanks for the help, Rob.

vagababov avatar Apr 24 '20 17:04 vagababov

Quick update on this feature. A new EndpointSliceMirroring controller has been added as part of the 1.19 release and along with that kube-proxy now reads from EndpointSlices on Linux by default.

robscott avatar Jul 16 '20 00:07 robscott

So basically this is a noop for us when ES are enabled by default?

vagababov avatar Jul 16 '20 19:07 vagababov

Yep, this should be a noop. It does mean there will be a slightly longer delay since the new mirroring controller watches Endpoints resources and then creates EndpointSlices. It also only supports mirroring up to 1000 IPs per subset.

robscott avatar Jul 16 '20 20:07 robscott
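
To make the per-subset limit mentioned above concrete, here is a small sketch that reports which subsets of an Endpoints-shaped dict exceed it. The helper name `unmirrored_counts` is hypothetical, and counting ready and not-ready addresses together against the limit is an assumption, not something confirmed in this thread.

```python
# Limit quoted in the comment above: 1000 IPs mirrored per subset.
MIRROR_LIMIT_PER_SUBSET = 1000


def unmirrored_counts(endpoints, limit=MIRROR_LIMIT_PER_SUBSET):
    """Return {subset_index: addresses beyond the limit} for oversized subsets.

    Assumption: ready and not-ready addresses count toward the same limit.
    """
    over = {}
    for index, subset in enumerate(endpoints.get("subsets", [])):
        total = len(subset.get("addresses", [])) + len(
            subset.get("notReadyAddresses", [])
        )
        if total > limit:
            over[index] = total - limit
    return over
```

An empty result means every subset fits within the limit, so mirroring would not be expected to drop addresses under this assumption.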

This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.

github-actions[bot] avatar Oct 15 '20 01:10 github-actions[bot]

/reopen /lifecycle frozen

vagababov avatar Nov 15 '20 01:11 vagababov

@vagababov: Reopened this issue.

knative-prow-robot avatar Nov 15 '20 01:11 knative-prow-robot

/unassign

vagababov avatar Mar 12 '21 17:03 vagababov

As I read the issue:

  1. It's a no-op (and working post 1.19 with the mirroring controller)
  2. There's no one currently working on anything here
  3. We have some ideas in the future, but haven't written any of them down.

I think we should

/close

this issue, and open new ones when we have concrete proposals where using EndpointSlices directly might help.

evankanderson avatar Mar 22 '21 04:03 evankanderson

@evankanderson: Closing this issue.

knative-prow-robot avatar Mar 22 '21 04:03 knative-prow-robot

/reopen

Users are hitting the 1000 endpoints limit so mirroring isn't working properly - https://cloud-native.slack.com/archives/C04LMU0AX60/p1726054917101579

I think switching over to EndpointSlices won't work with clusters using kubedns (which I believe GKE is using)

@robscott are there any plans for GKE to move to CoreDNS or CloudDNS by default?

dprotaso avatar Sep 19 '24 17:09 dprotaso

@dprotaso: Reopened this issue.

knative-prow[bot] avatar Sep 19 '24 17:09 knative-prow[bot]

We may be able to work around this by creating multiple subsets - see the docs here

https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/#endpointslice-mirroring

dprotaso avatar Sep 19 '24 17:09 dprotaso
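
The multiple-subsets idea above could look roughly like the following sketch, which splits one oversized subset into several subsets of at most 1000 addresses each. The helper name `split_subset` is hypothetical, and whether Kubernetes keeps subsets with identical ports separate (rather than merging them again) is an assumption that would need to be verified before relying on this.

```python
def split_subset(subset, limit=1000):
    """Split one Endpoints subset into subsets of at most `limit` addresses.

    Each resulting subset keeps the original ports and preserves whether an
    address was ready or not ready.
    """
    tagged = [(addr, "addresses") for addr in subset.get("addresses", [])]
    tagged += [
        (addr, "notReadyAddresses")
        for addr in subset.get("notReadyAddresses", [])
    ]

    chunks = []
    for start in range(0, len(tagged), limit):
        # Copy the ports so the new subsets do not share one list object.
        chunk = {"ports": list(subset.get("ports", []))}
        for addr, key in tagged[start:start + limit]:
            chunk.setdefault(key, []).append(addr)
        chunks.append(chunk)
    return chunks
```

For example, a subset with 2500 addresses would be split into three subsets holding 1000, 1000, and 500 addresses.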

@robscott are there any plans for GKE to move to CoreDNS or CloudDNS by default?

We recommend using Cloud DNS in GKE which does support EndpointSlices and is the default in at least Autopilot clusters (not sure about other modes).

robscott avatar Sep 19 '24 18:09 robscott