
Investigate API Throttling in Node Controller

Open andrewsykim opened this issue 5 years ago • 13 comments

ref: https://github.com/kubernetes/kubernetes/issues/75016

For large clusters, we're seeing API throttling from providers become more common. Taking node-controller as an example, it issues one "get node" API request per node on every sync loop. For a 1000-node cluster, that could be 1000 get requests per minute, which can result in users running out of API quota.

andrewsykim avatar Mar 06 '19 17:03 andrewsykim
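
To make the arithmetic in the opening comment concrete, here is a minimal Go sketch that counts provider calls per sync loop. It compares the one-request-per-node pattern described above with a batched alternative. The batched variant is only an assumption for illustration: it presumes the provider offers a list-style call returning up to `pageSize` nodes per request, which not every cloud API does. The function names and the page size of 100 are hypothetical and are not taken from the node controller code.

```go
package main

import "fmt"

// perNodeRequests returns the number of provider calls in one sync loop
// when the controller issues one "get node" request per node.
func perNodeRequests(nodes int) int {
	if nodes < 0 {
		return 0
	}
	return nodes
}

// batchedRequests returns the number of provider calls in one sync loop
// if a (hypothetical) list API returns up to pageSize nodes per call.
// It rounds up so that a partially filled last page still costs one call.
func batchedRequests(nodes, pageSize int) int {
	if nodes <= 0 || pageSize <= 0 {
		return 0
	}
	return (nodes + pageSize - 1) / pageSize
}

func main() {
	const nodes = 1000   // cluster size used in the issue description
	const pageSize = 100 // assumed page size, for illustration only

	fmt.Printf("per-node gets per sync: %d\n", perNodeRequests(nodes))
	fmt.Printf("batched lists per sync: %d\n", batchedRequests(nodes, pageSize))
}
```

Under these assumptions, the per-node pattern scales linearly with cluster size, while a batched call scales with the number of pages. Whether batching is actually available depends on the provider.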

/assign @aoxn @andrewsykim

andrewsykim avatar Mar 06 '19 17:03 andrewsykim

SIG OpenStack has seen similar issues, but we're addressing them in our own implementations. We would welcome a generic solution if one is appropriate, but we're fine with adding cloud-specific tuning variables to our provider.

hogepodge avatar Mar 20 '19 20:03 hogepodge

@hogepodge Cool work. Could you share your solutions with us?

aoxn avatar Mar 21 '19 01:03 aoxn

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot avatar Jun 26 '19 19:06 fejta-bot

/remove-lifecycle stale

feiskyer avatar Jun 27 '19 06:06 feiskyer

Investigate severity of API throttling for v1.16

andrewsykim avatar Jul 10 '19 20:07 andrewsykim

If folks have user stories on how bad the API throttling is for their clusters, that would be really helpful in determining the priority of this issue.

andrewsykim avatar Jul 10 '19 21:07 andrewsykim

This would be the issue for Azure. Refer to #30.

feiskyer avatar Jul 11 '19 08:07 feiskyer

Issues go stale after 90d of inactivity. Mark the issue as fresh with /remove-lifecycle stale. Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot avatar Dec 31 '19 21:12 fejta-bot

/remove-lifecycle stale

cheftako avatar Jan 02 '20 23:01 cheftako

/lifecycle frozen

cheftako avatar Jan 02 '20 23:01 cheftako

/help

andrewsykim avatar Apr 15 '20 20:04 andrewsykim

@andrewsykim: This request has been marked as needing help from a contributor.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed by commenting with the /remove-help command.

In response to this:

/help

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot avatar Apr 15 '20 20:04 k8s-ci-robot