ingress-nginx
feat: switch from endpoints to endpointslices
What this PR does / why we need it:
This PR changes the data source for NGINX upstreams from Endpoints to EndpointSlices. It drops Endpoints entirely, since they are mirrored into slices. If this PR is acceptable, I will also implement support for topology-aware hints on top of it. Thx
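For reviewers who want a quick mental model, here is a minimal sketch of the slice-based lookup using client-go. It is not the controller code from this PR; the namespace `default`, the service name `web`, and the panic-based error handling are placeholders for illustration only.

```go
// Minimal sketch of collecting upstream addresses from EndpointSlices instead
// of Endpoints. Assumes client-go; "default"/"web" are placeholder values.
package main

import (
	"context"
	"fmt"

	discoveryv1 "k8s.io/api/discovery/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// A Service maps to one or more EndpointSlices, linked via the
	// kubernetes.io/service-name label, so all matching slices are aggregated.
	selector := fmt.Sprintf("%s=%s", discoveryv1.LabelServiceName, "web")
	slices, err := client.DiscoveryV1().EndpointSlices("default").List(
		context.TODO(), metav1.ListOptions{LabelSelector: selector})
	if err != nil {
		panic(err)
	}

	var upstreams []string
	for _, slice := range slices.Items {
		for _, ep := range slice.Endpoints {
			// Only ready endpoints become upstreams.
			if ep.Conditions.Ready != nil && !*ep.Conditions.Ready {
				continue
			}
			for _, addr := range ep.Addresses {
				for _, port := range slice.Ports {
					if port.Port != nil {
						upstreams = append(upstreams, fmt.Sprintf("%s:%d", addr, *port.Port))
					}
				}
			}
		}
	}
	fmt.Printf("collected %d upstream addresses\n", len(upstreams))
}
```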
Types of changes
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [X] Breaking change (fix or feature that would cause existing functionality to change)
- [ ] Documentation only
Which issue/s this PR fixes
fixes #6017
How Has This Been Tested?
Checklist:
- [X] My change requires a change to the documentation.
- [ ] I have updated the documentation accordingly.
- [X] I've read the CONTRIBUTION guide
- [X] I have added tests to cover my changes.
- [X] All new and existing tests passed.
Hi @tombokombo. Thanks for your PR.
I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.
Once the patch is verified, the new status will be reflected by the ok-to-test label.
I understand the commands that are listed here.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/ok-to-test
@rikatz this would be a good one for 2.0.0 release as well.
YES

/triage accepted
@tombokombo thanks for this!
I would just like to ask that we split the EndpointSlice impl (this) from the topology hint :) This way we can have different PRs for the features.
Other than that, this looks great and I really appreciate you worked on this!
@rikatz Yes, I meant it like that: this PR implements slices and the next one will add topology hints. I have nothing more to add to this PR, except maybe bumping the chart version because of the RBAC changes.
No no, let’s not bump yet, but lgtm
i will do another review later
/kind feature
/priority backlog
Is this really a breaking change? EndpointSlices went stable in k8s 1.21, and they are supported all the way back to 1.21.
> Is this really a breaking change? EndpointSlices went stable in k8s 1.21, and they are supported all the way back to 1.21.
It depends on the definition of "breaking change". It could potentially cause problems in some edge cases, e.g. if someone is using the endpointslice.kubernetes.io/skip-mirror annotation. But it should work transparently in most cases.
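To make that edge case concrete, here is a rough, hypothetical helper (not part of this PR) that lists Endpoints objects opting out of mirroring via the `endpointslice.kubernetes.io/skip-mirror` label; such objects never get a mirrored EndpointSlice, so a slice-only controller would not see them.

```go
// Hypothetical check for the skip-mirror edge case (illustration only).
package skipmirror

import (
	"context"

	discoveryv1 "k8s.io/api/discovery/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// unmirroredEndpoints returns the names of Endpoints objects in namespace that
// opted out of mirroring and would therefore be invisible to a slice-only controller.
func unmirroredEndpoints(ctx context.Context, client kubernetes.Interface, namespace string) ([]string, error) {
	eps, err := client.CoreV1().Endpoints(namespace).List(ctx, metav1.ListOptions{})
	if err != nil {
		return nil, err
	}
	var names []string
	for _, ep := range eps.Items {
		if ep.Labels[discoveryv1.LabelSkipMirror] == "true" {
			names = append(names, ep.Name)
		}
	}
	return names, nil
}
```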
Hi guys, this has been sitting without any action for a while. Is there anything I can do to push this PR forward?
Hey @tombokombo
I’m on vacation, but I promise I will take a look as soon as I’m back.
Hi @rikatz / @tombokombo, can we please release this soon? We are also blocked on this, unable to use more than 1000 endpoints in a particular service. Is there any annotation or other way to unblock this at the NGINX level?
Hi @abhiroop93
I would love to see it merged and released, but it's not on me.
/assign
@strongjz @cpanato Can you please review this? 😅
Although all Kubernetes versions supported by the current ingress-nginx project already support the EndpointSlice feature, I would like to discuss whether we should add a configuration item that lets users switch between Endpoints and EndpointSlices.
I would bump the chart's required kubeVersion from v1.20 to v1.21 (stable EndpointSlices) and that's it. The reason is simple: the Endpoints resource has been truncated to 1000 endpoints since v1.22, so it does not work properly in bigger clusters.
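As a side note, here is a small sketch (my own illustration, not ingress-nginx code) of how that truncation can be detected: as far as I know, since v1.22 the control plane caps an Endpoints object at 1000 addresses and marks it with the `endpoints.kubernetes.io/over-capacity: truncated` annotation.

```go
// Rough sketch of detecting Endpoints truncation via the over-capacity
// annotation (illustration only, based on my reading of the v1.22 behaviour).
package truncation

import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// endpointsTruncated reports whether the Endpoints object for svc was capped,
// i.e. some backends are hidden from consumers that still read Endpoints.
func endpointsTruncated(ctx context.Context, client kubernetes.Interface, namespace, svc string) (bool, error) {
	ep, err := client.CoreV1().Endpoints(namespace).Get(ctx, svc, metav1.GetOptions{})
	if err != nil {
		return false, err
	}
	return ep.Annotations["endpoints.kubernetes.io/over-capacity"] == "truncated", nil
}
```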
@tao12345666333 do you need a hand reviewing this? FWIW we have been running ingress-nginx with this commit for a while without any issue.
> Although all Kubernetes versions supported by the current ingress-nginx project already support the EndpointSlice feature, I would like to discuss whether we should add a configuration item that lets users switch between Endpoints and EndpointSlices.
If all of them support it, what are your reservations?
@ElvinEfendi @tombokombo After thinking about it, it occurred to me that it is safe for us to migrate to EndpointSlices now; this feature has been stable since v1.21.
> @tao12345666333 do you need a hand reviewing this? FWIW we have been running ingress-nginx with this commit for a while without any issue.
It would be great if you could help :cat:
@wbean1 @kalindudc can you share your reviews / test results from our private fork of this commit here?
Tl;dr: Works as expected. The controller was able to see >1000 endpoints and used all of them to route traffic.
Tested this commit on a private fork
I deployed this change to a test controller and scaled up a test application behind this controller to >1000 pods. The change was successful and the controller was able to see >1000 endpoints.
```
❯ kcl --context apps-b-us-east1-12 -n routing-foundations-misc-production-unrestricted-qcg6 get deploy
NAME   READY       UP-TO-DATE   AVAILABLE   AGE
web    1015/1015   1015         1015        122d

❯ kngx --context apps-b-us-east1-12 -n sfe-controller backends | jq '.[] | select(.name=="routing-foundations-misc-production-unrestricted-qcg6-web-80") | .endpoints[].address' | wc -l
1015
```
To further confirm this change I generated some load for this application and observed that the controller did in fact pick >1000 unique upstreams to proxy the requests.
Note: the above image shows 1022 upstreams instead of 1015 because a node was removed during a scale-down, which removed and brought up some of the application's pods.
Additional tests
- Rolling restarts: I ran rolling restarts and configuration reloads for this controller under some load, and the Kubernetes update behaviour did not cause any problems. The app continued to receive traffic and the controller successfully rolled out new pods without any downtime.
With those data points, I feel confident about releasing this feature, probably for 1.4.0, then we can cut a release branch as we discussed in today's community meeting.
@tao12345666333 @rikatz ?
@tao12345666333 @rikatz @strongjz Any clue when we would be able to get this?
> Any clue when we would be able to get this?
yes.
@kalindudc @ElvinEfendi Thanks for your help!
I want to take a final look at the code today.
@tao12345666333 Sorry to bug you again about this, but can you please have a look here? We are really stuck because we are not able to use services running more than 1000 pods.
Hi @abhiroop93, just out of curiosity, would you mind describing your use case that requires over 1000 endpoints per service?
@longwuyuan We have heavy throughput on some of our Node.js applications, in which case we have to run more than 1000 pods for those workloads. And since we have migrated to 1.21+, the Endpoints resource does not support more than 1000 endpoints.
Sorry, I am indeed reviewing the code. It's a pity I don't have the full time needed to do it quickly :worried: