ingress-nginx icon indicating copy to clipboard operation
ingress-nginx copied to clipboard

feat: switch from endpoints to endpointslices

Open tombokombo opened this issue 2 years ago • 34 comments

What this PR does / why we need it:

This PR change data source for nginx upstreams from endpoints to endpointSlices. It drops endpoints as they are mirrored to slices. If this PR is acceptable, I will also implement support for topology-aware-hints on the top of it. Thx

Types of changes

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [ ] New feature (non-breaking change which adds functionality)
  • [X] Breaking change (fix or feature that would cause existing functionality to change)
  • [ ] Documentation only

Which issue/s this PR fixes

fixes #6017

How Has This Been Tested?

Checklist:

  • [X] My change requires a change to the documentation.
  • [ ] I have updated the documentation accordingly.
  • [X] I've read the CONTRIBUTION guide
  • [X] I have added tests to cover my changes.
  • [X] All new and existing tests passed.

tombokombo avatar Jul 31 '22 22:07 tombokombo

Hi @tombokombo. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Jul 31 '22 22:07 k8s-ci-robot

/ok-to-test

strongjz avatar Aug 01 '22 12:08 strongjz

@rikatz this would be a good one for 2.0.0 release as well.

strongjz avatar Aug 01 '22 12:08 strongjz

YES /triage accepted

rikatz avatar Aug 01 '22 21:08 rikatz

@tombokombo thanks for this!

I would just like to ask that we split the EndpointSlice impl (this) from the topology hint :) This way we can have different PRs for the features.

Other than that, this looks great and I really appreciate you worked on this!

rikatz avatar Aug 01 '22 22:08 rikatz

@rikatz Yes, I meant it like that, this PR implements slices and next one topology hints. I have nothing more to this PR, maybe bumping chart version, because of rbac changes.

tombokombo avatar Aug 01 '22 22:08 tombokombo

No no, let’s not bump yet, but lgtm

i will do another review later

rikatz avatar Aug 01 '22 22:08 rikatz

/kind feature /priority backlog

strongjz avatar Aug 02 '22 13:08 strongjz

is this really a breaking change? Endpoint slices went stable in k8s 1.21 and were "Supporting" back to 1.21.

strongjz avatar Aug 09 '22 17:08 strongjz

is this really a breaking change? Endpoint slices went stable in k8s 1.21 and were "Supporting" back to 1.21.

Depends on "breaking change" definition, it could potentially cause problems in some edge cases, eg. if someone is using endpointslice.kubernetes.io/skip-mirror annotation. But should work transparently in most cases.

tombokombo avatar Aug 09 '22 22:08 tombokombo

Hi guys, it's without any action for a while. Is there anything I can do to push this PR?

tombokombo avatar Aug 19 '22 23:08 tombokombo

Hey @tombokombo

i’m on vacation but promise as soon as I’m back will take a look

rikatz avatar Aug 19 '22 23:08 rikatz

Hi @rikatz / @tombokombo Can we please release this soon? We are also blocked on this, unable to use more than 1000 endpoints in a particular service. Is there any annotation/way that can unblock this at an nginx level?

abhiroop93 avatar Sep 08 '22 12:09 abhiroop93

Hi @abhiroop93
I would love to see it merged and released, but it's not on me.

tombokombo avatar Sep 09 '22 10:09 tombokombo

/assign

tao12345666333 avatar Sep 09 '22 10:09 tao12345666333

@strongjz @cpanato Can you please review this? 😅

abhiroop93 avatar Sep 12 '22 07:09 abhiroop93

Although all Kubernetes versions supported by the current ingress-nginx project already support the endpointslice feature, I would like to discuss whether we should add a configuration item that allows users to switch between endpoints and endpointslice

I would bump chart required kubeVersion from v1.20 to v1.21 ( stable EndpointSlices ) and that's it. The reason in simple, endpoint resource is truncated to 1000 endpoints since v1.22, so in a bigger clusters it's not working properly.

tombokombo avatar Sep 12 '22 13:09 tombokombo

@tao12345666333 do you need a hand reviewing this? FWIW we have been running ingress-nginx with this commit for a while without any issue.

ElvinEfendi avatar Sep 12 '22 15:09 ElvinEfendi

Although all Kubernetes versions supported by the current ingress-nginx project already support the endpointslice feature, I would like to discuss whether we should add a configuration item that allows users to switch between endpoints and endpointslice

If all of them supports it, what are your reservations?

ElvinEfendi avatar Sep 12 '22 15:09 ElvinEfendi

@ElvinEfendi @tombokombo After I thought about it, it occurred to me that it is safe for us to migrate to endpointslice now, this feature has reached stable since v1.21.

@tao12345666333 do you need a hand reviewing this? FWIW we have been running ingress-nginx with this commit for a while without any issue.

It would be great if you could help :cat:

tao12345666333 avatar Sep 12 '22 16:09 tao12345666333

@wbean1 @kalindudc can you share your reviews / test results from our private fork of this commit here?

ElvinEfendi avatar Sep 15 '22 07:09 ElvinEfendi

Tl;dr: Works as expected. Controller was able to see >1000 endpoints and used all to route traffic.

Tested this commit on a private fork

I deployed this change to a test controller and scaled up a test application behind this controller to >1000 pods. The change was successful and the controller was able to see >1000 endpoints.

❯ kcl --context apps-b-us-east1-12 -n routing-foundations-misc-production-unrestricted-qcg6 get deploy
NAME   READY       UP-TO-DATE   AVAILABLE   AGE
web    1015/1015   1015         1015        122d

❯ kngx --context apps-b-us-east1-12 -n sfe-controller backends | jq '.[] | select(.name=="routing-foundations-misc-production-unrestricted-qcg6-web-80") | .endpoints[].address' | wc -l
    1015

To further confirm this change I generated some load for this application and observed that the controller did in-fact pick >1000 unique upstreams to proxy the requests. image

Note: The above image shows 1022 upstreams instead of 1015 is because a node was removed due to a scale down, which removed and brought up some pods from the application

Additional tests

  • Rolling restarts: I ran rolling restart / configuration reload for this controller under some load and the k8s update behaviour did not cause any problems. The app continued to receive traffic and the controller successfully rolled out new pods without any downtime.

image image

kalindudc avatar Sep 15 '22 16:09 kalindudc

With those data points, I feel confident about releasing this feature, probably for 1.4.0, then we can cut a release branch as we discussed in today's community meeting.

@tao12345666333 @rikatz ?

strongjz avatar Sep 15 '22 19:09 strongjz

@tao12345666333 @rikatz @strongjz Any clue when would we be able to get this?

abhiroop93 avatar Sep 20 '22 07:09 abhiroop93

Any clue when would we be able to get this?

yes.

@kalindudc @ElvinEfendi Thanks for your help!

tao12345666333 avatar Sep 20 '22 11:09 tao12345666333

I want to take a final look at the code today.

tao12345666333 avatar Sep 20 '22 11:09 tao12345666333

@tao12345666333 Sorry to bug you again about this, but can you please have a look here? We are really stuck because we are not able to use services running more than 1000 pods.

abhiroop93 avatar Sep 22 '22 04:09 abhiroop93

Hi @abhiroop93, just as a curiosity, is it ok for you to describe your use-case that requires over 1000 endpoints per service

longwuyuan avatar Sep 22 '22 04:09 longwuyuan

@longwuyuan We have heavy throughput on some of our nodejs applications, in which case we have to run more than 1000 pods for those workloads. And since we have migrated to 1.21+, endpoint constructs do not support more than 1000 endpoints.

abhiroop93 avatar Sep 22 '22 05:09 abhiroop93

Sorry, I am indeed reviewing the code. It's a pity I don't have a full time to do it quickly :worried:

tao12345666333 avatar Sep 22 '22 06:09 tao12345666333