ingress-nginx Support Kubernetes EndpointSlices

Support Kubernetes EndpointSlices, a newer feature in Kubernetes that allows restricting or customizing where traffic is sent within a cluster.

Background:

https://stackoverflow.com/questions/63399080/kubernetes-1-18-6-servicetopology-and-ingress-support

Not that I know of

K8s 1.17 and above (Beta): https://kubernetes.io/docs/concepts/services-networking/endpoint-slices

/kind feature

Aug 13 '20 19:08 raravena80

@raravena80 what are you trying to do exactly?

Aug 13 '20 20:08 aledbf

K8s 1.17 and above (Beta): https://kubernetes.io/docs/concepts/services-networking/endpoint-slices

Right, but for some context, the majority of the users are still running k8s < 1.16, even 1.13. Adding a feature like this one only adds complexity to the project.

Without a clear problem this feature could solve, I don't see the reason to add support, at least until users run k8s > 1.17

Aug 13 '20 20:08 aledbf

@aledbf this is based on this Stackoverflow question: https://stackoverflow.com/questions/63399080/kubernetes-1-18-6-servicetopology-and-ingress-support

Thanks!

Aug 13 '20 21:08 raravena80

this is based on this Stackoverflow question: https://stackoverflow.com/questions/63399080/kubernetes-1-18-6-servicetopology-and-ingress-support

Interesting.

The question itself, about service topology, can be solved using the service-upstream annotation. That said, the source of the connection will then be ingress-nginx, which delegates the topology part to the k8s service topology feature (topologyKeys). But then you cannot have custom LB algorithms or sticky sessions.

The EndpointSlices part makes sense when you have services with a lot of endpoints (> 100).

Aug 13 '20 21:08 aledbf

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta. /lifecycle stale

Nov 11 '20 21:11 fejta-bot

/remove-lifecycle stale

Nov 11 '20 21:11 raravena80

EndpointSlices are a game changer, not only for the scalability benefits they bring to services with a lot of endpoints: they also bring performance improvements and cost savings in cloud environments like AWS.

It is possible to group endpoints per availability zone and, based on the identity of the nginx pod, prefer the endpoints in your own zone over those in other zones. This saves you money and boosts performance because the traffic stays in the same AZ.

Dec 15 '20 05:12 ltagliamonte-dd
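
The zone preference described in the comment above can be sketched roughly as follows. This is only an illustration, not ingress-nginx code: the `Endpoint` type, the `prefer_local_zone` helper, the addresses, and the zone names are all hypothetical, and the zone value is assumed to have been read from each EndpointSlice endpoint beforehand.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Endpoint:
    address: str
    zone: str  # assumption: taken from the zone recorded on the EndpointSlice endpoint


def prefer_local_zone(endpoints: List[Endpoint], local_zone: str) -> List[Endpoint]:
    """Return the endpoints in local_zone, or all endpoints if that zone has none."""
    local = [ep for ep in endpoints if ep.zone == local_zone]
    # Fall back to every endpoint so traffic is never dropped when the
    # local zone has no backends.
    return local if local else list(endpoints)


# Hypothetical data for illustration only.
endpoints = [
    Endpoint("10.0.1.5", "us-east-1a"),
    Endpoint("10.0.2.7", "us-east-1b"),
    Endpoint("10.0.1.9", "us-east-1a"),
]
same_zone = prefer_local_zone(endpoints, "us-east-1a")
fallback = prefer_local_zone(endpoints, "us-east-1c")
```

A real implementation would likely also need to guard against overloading a zone that has only a few backends; this sketch ignores that.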

this is based on this Stackoverflow question: https://stackoverflow.com/questions/63399080/kubernetes-1-18-6-servicetopology-and-ingress-support

Interesting.

The question itself, about service topology, can be solved using the service-upstream annotation. That said, the source of the connection will then be ingress-nginx, which delegates the topology part to the k8s service topology feature (topologyKeys). But then you cannot have custom LB algorithms or sticky sessions.

The EndpointSlices part makes sense when you have services with a lot of endpoints (> 100).

Please correct me @aledbf, but I believe it would make sense to consider endpoint slices and topology-aware routing in this project as well. K8s services are kind of difficult to use in an HTTP/2 context, e.g. when using gRPC, due to its connection reuse/multiplexing. There is the possibility of using a headless service and DNS-based client load balancing, but this also comes with some issues, e.g. getting notified of new pods (which can be mitigated by lowering the TTL). The only clean solution here is working on endpoints directly. On the client side, there is a project for this, https://github.com/sercand/kuberesolver, even though it does not support endpoint slices and topology-aware routing yet.

So if we want to have topology-aware routing (which does make sense for many cases, especially cost reduction in a multi-AZ environment) for HTTP/2, we might need to include some logic working on endpoint slices and certain routing preferences.

See also: https://github.com/zalando/skipper/issues/1446 https://github.com/linkerd/linkerd2/pull/4780

Jan 11 '21 17:01 ecktom
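
To illustrate the HTTP/2 point made above, here is a toy model (not real networking code) that contrasts picking a backend once per connection with picking one per request. The pod names, the request counts, and the use of round-robin are assumptions chosen to keep the example deterministic.

```python
from collections import Counter
from itertools import cycle
from typing import List


def connection_level(pods: List[str], connections: int, requests_per_connection: int) -> Counter:
    """Pick a backend once per connection; every request on that connection follows it."""
    picker = cycle(pods)
    hits: Counter = Counter()
    for _ in range(connections):
        pod = next(picker)
        hits[pod] += requests_per_connection
    return hits


def request_level(pods: List[str], total_requests: int) -> Counter:
    """Pick a backend for every request, which requires knowing the endpoints directly."""
    picker = cycle(pods)
    hits: Counter = Counter()
    for _ in range(total_requests):
        hits[next(picker)] += 1
    return hits


pods = ["pod-a", "pod-b", "pod-c"]  # hypothetical backends
# One long-lived, multiplexed HTTP/2 connection carrying 300 requests.
one_connection = connection_level(pods, connections=1, requests_per_connection=300)
# The same 300 requests balanced individually across known endpoints.
per_request = request_level(pods, total_requests=300)
```

In this model the single multiplexed connection pins all 300 requests to one pod, while per-request balancing spreads them evenly. That is the reason a proxy or client working on endpoints directly behaves better for gRPC.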

So if we want to have topology-aware routing (which does make sense for many cases, especially cost reduction in a multi-AZ environment) for HTTP/2, we might need to include some logic working on endpoint slices and certain routing preferences.

We have a KEP to add support for zone-aware routing, but such a feature requires massive changes on the Lua side of the controller.

Using topology-aware routing (from k8s) means you lose several features from ingress-nginx, like sticky sessions, due to the use of the k8s service abstraction instead of endpoints.

Jan 11 '21 18:01 aledbf

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale

Apr 11 '21 19:04 fejta-bot

/remove-lifecycle stale.

Apr 11 '21 20:04 raravena80

Stale issues rot after 30d of inactivity. Mark the issue as fresh with /remove-lifecycle rotten. Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten

May 11 '21 20:05 fejta-bot

/remove-lifecycle rotten.

May 11 '21 20:05 raravena80

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community. /close

Jun 10 '21 20:06 fejta-bot

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity. Reopen the issue with /reopen. Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-contributor-experience at kubernetes/community. /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Jun 10 '21 20:06 k8s-ci-robot

Let's not forget that Endpoints resources are going to be deprecated very soon.

Jun 10 '21 21:06 ltagliamonte-dd

/reopen

This is still not fixed, and one can hit K8s control plane availability problems when there is high churn on large services in the cluster and there are lots of ingress-nginx-controller replicas: the apiserver needs to send notifications about endpoints changes to lots of watchers, which often ends up overloading it.

Jun 20 '22 09:06 tosi3k
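
A back-of-the-envelope sketch of the fan-out described above. Every number here is an assumption picked for illustration (bytes per endpoint, replica count, service size), and the model simply multiplies watchers by the size of the object that changed. It assumes a single pod change rewrites the whole Endpoints object but only one EndpointSlice.

```python
def bytes_per_update(watchers: int, object_bytes: int) -> int:
    """Bytes the apiserver sends when one watched object changes."""
    return watchers * object_bytes


# All values below are hypothetical.
ENDPOINT_BYTES = 100          # assumed serialized size of one endpoint entry
SERVICE_ENDPOINTS = 1000      # endpoints behind one large Service
SLICE_SIZE = 100              # endpoints per EndpointSlice
CONTROLLER_REPLICAS = 50      # ingress-nginx-controller pods watching the API

# Endpoints API: the whole object is resent to every watcher.
endpoints_cost = bytes_per_update(CONTROLLER_REPLICAS, SERVICE_ENDPOINTS * ENDPOINT_BYTES)
# EndpointSlice API: only the slice containing the changed pod is resent.
slice_cost = bytes_per_update(CONTROLLER_REPLICAS, SLICE_SIZE * ENDPOINT_BYTES)
ratio = endpoints_cost // slice_cost
```

Under these assumptions each pod change costs ten times less traffic with slices, and the gap grows with the size of the Service.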

@tosi3k: Reopened this issue.

In response to this:

/reopen

This is still not fixed, and one can hit K8s control plane availability problems when there is high churn on large services in the cluster and there are lots of ingress-nginx-controller replicas: the apiserver needs to send notifications about endpoints changes to lots of watchers, which often ends up overloading it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Jun 20 '22 09:06 k8s-ci-robot

The lack of an EndpointSlices implementation is unfortunately impacting production for us now. Since Kubernetes v1.22, for Services that exceed 1000 Pods/network endpoints, the Endpoints object is truncated to a maximum of 1000 items.

Jun 29 '22 17:06 ottoyiu
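
The effect of that truncation can be sketched as follows. This is a plain-Python model, not a Kubernetes client: `split_into_slices` and `merge_slices` are hypothetical helpers, the addresses are generated for illustration, and the slice size of 100 is assumed to match the default (it is configurable on the control plane).

```python
from typing import List

ENDPOINTS_LIMIT = 1000  # truncation limit of the Endpoints object mentioned above
MAX_PER_SLICE = 100     # assumed slice size


def split_into_slices(addresses: List[str], max_per_slice: int = MAX_PER_SLICE) -> List[List[str]]:
    """Model how a Service's addresses are spread across several slices."""
    return [addresses[i:i + max_per_slice] for i in range(0, len(addresses), max_per_slice)]


def merge_slices(slices: List[List[str]]) -> List[str]:
    """Rebuild the full backend list, dropping duplicates but keeping order."""
    seen = {}
    for slice_addresses in slices:
        for address in slice_addresses:
            seen.setdefault(address, None)
    return list(seen)


# 1500 unique, made-up pod addresses behind one Service.
addresses = [f"10.0.{i // 250}.{i % 250}" for i in range(1500)]

truncated_view = addresses[:ENDPOINTS_LIMIT]  # what an Endpoints consumer would see
slices = split_into_slices(addresses)
merged_view = merge_slices(slices)            # what an EndpointSlice consumer can rebuild
```

A consumer of the truncated object never routes to the last 500 pods, while a consumer that merges all slices sees every backend. The de-duplication step is there because the same endpoint may briefly appear in more than one slice.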

/priority backlog /triage accepted /project Stabilization Project

Jul 07 '22 15:07 strongjz

@strongjz: You must be a member of the kubernetes/ingress-nginx github team to set the project and column.

In response to this:

/priority backlog /triage accepted /project Stabilization Project

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Jul 07 '22 15:07 k8s-ci-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Reopen this issue or PR with /reopen
Mark this issue or PR as fresh with /remove-lifecycle rotten
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Aug 06 '22 16:08 k8s-triage-robot

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied

After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied

After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Reopen this issue or PR with /reopen

Mark this issue or PR as fresh with /remove-lifecycle rotten

Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Aug 06 '22 16:08 k8s-ci-robot

/reopen

Aug 08 '22 07:08 tosi3k

@tosi3k: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Aug 08 '22 07:08 k8s-ci-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied
After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Reopen this issue or PR with /reopen
Mark this issue or PR as fresh with /remove-lifecycle rotten
Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Sep 07 '22 07:09 k8s-triage-robot

@k8s-triage-robot: Closing this issue.

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

After 90d of inactivity, lifecycle/stale is applied

After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied

After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

Reopen this issue or PR with /reopen

Mark this issue or PR as fresh with /remove-lifecycle rotten

Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sep 07 '22 07:09 k8s-ci-robot

/reopen /remove-lifecycle rotten /lifecycle frozen

Sep 07 '22 08:09 tosi3k

@tosi3k: Reopened this issue.

In response to this:

/reopen /remove-lifecycle rotten /lifecycle frozen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sep 07 '22 08:09 k8s-ci-robot

https://github.com/kubernetes/ingress-nginx/pull/8890 is currently working on this feature

/lifecycle frozen

Sep 07 '22 13:09 strongjz
