goldilocks: getAllTopControllers generates too much load on the API server

Open max0ne opened this issue 1 year ago • 0 comments

What happened?

When Goldilocks reconciles a namespace, any pod change triggers a list of all pods in that namespace, which puts a significant amount of load on the API server. During a deployment rollout in which all old pods are removed and replaced with new pods, the load grows as O(n^2), with n being the number of pods: roughly n pod events each trigger a list that returns roughly n pods.
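To make the quadratic growth concrete, here is a rough back-of-the-envelope sketch in Go. It is not Goldilocks code. It assumes one event per pod deletion and one per pod creation, one namespace-wide pod list per event, and no caching or batching. The function name `podsReturnedDuringRollout` is made up for illustration.

```go
package main

import "fmt"

// podsReturnedDuringRollout is a hypothetical model, not Goldilocks code.
// It simulates replacing n pods one at a time and sums the number of pod
// objects returned if every delete/create event triggers a full pod list.
func podsReturnedDuringRollout(n int) int {
	total := 0
	live := n
	for i := 0; i < n; i++ {
		live--        // an old pod is deleted, which triggers a list
		total += live // the list returns every pod still in the namespace
		live++        // the replacement pod is created, which triggers a list
		total += live
	}
	return total
}

func main() {
	for _, n := range []int{10, 100, 1000} {
		fmt.Printf("n=%d pods -> %d pod objects listed\n", n, podsReturnedDuringRollout(n))
	}
	// Prints:
	// n=10 pods -> 190 pod objects listed
	// n=100 pods -> 19900 pod objects listed
	// n=1000 pods -> 1999000 pod objects listed
}
```

Under these assumptions the total is 2n^2 - n, so a 10x increase in pod count costs roughly 100x more listed objects. Real rollouts also emit status-update events, so the actual number is likely higher.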

This amount of load is not acceptable in larger clusters. We have seen a similar problem with Vector (https://github.com/vectordotdev/vector/issues/16798), and it brought down our API server.

This problem is also mentioned in this issue but not fully addressed: https://github.com/FairwindsOps/goldilocks/issues/536

What did you expect to happen?

A potential solution: Goldilocks could list only apps/v1/deployments and some other known top controller types, with the list of types specified through config.
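As a sketch of what such a config could look like, the Go snippet below parses a comma-separated list of `group/version/resource` strings. Everything here is an assumption for illustration: the config format, the `GVR` type, and the function names are hypothetical and are not part of Goldilocks. The statefulsets and daemonsets entries are only examples of "other known top controller types".

```go
package main

import (
	"fmt"
	"strings"
)

// GVR is a hypothetical, stdlib-only stand-in for a group/version/resource triple.
type GVR struct {
	Group, Version, Resource string
}

// parseGVR parses a single "group/version/resource" entry,
// for example "apps/v1/deployments".
func parseGVR(s string) (GVR, error) {
	parts := strings.Split(strings.TrimSpace(s), "/")
	if len(parts) != 3 || parts[0] == "" || parts[1] == "" || parts[2] == "" {
		return GVR{}, fmt.Errorf("invalid controller type %q, want group/version/resource", s)
	}
	return GVR{Group: parts[0], Version: parts[1], Resource: parts[2]}, nil
}

// parseTopControllerTypes parses a comma-separated list of entries
// (an assumed config format) and skips empty items.
func parseTopControllerTypes(csv string) ([]GVR, error) {
	var out []GVR
	for _, item := range strings.Split(csv, ",") {
		if strings.TrimSpace(item) == "" {
			continue
		}
		gvr, err := parseGVR(item)
		if err != nil {
			return nil, err
		}
		out = append(out, gvr)
	}
	return out, nil
}

func main() {
	types, err := parseTopControllerTypes("apps/v1/deployments, apps/v1/statefulsets, apps/v1/daemonsets")
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	// A reconciler could list only these resources instead of every pod.
	for _, t := range types {
		fmt.Printf("would list %s (group %s, version %s)\n", t.Resource, t.Group, t.Version)
	}
}
```

With a list like this, the number of objects fetched per reconcile would scale with the number of controllers in the namespace rather than the number of pods.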

How can we reproduce this?

N/A

Version

Any version after 4.0.0

Search

[X] I did search for other open and closed issues before opening this.

Code of Conduct

[X] I agree to follow this project's Code of Conduct

Additional context

No response

Jan 18 '24 23:01 max0ne
