helm-controller

Lots of client-side throttling on clusters with Azure Service Operator

Open masterphenix opened this issue 2 years ago • 1 comments

Hello, we are seeing a large number of client-side throttling messages in the Helm Controller on a cluster that has 150 CRDs, most of them deployed by Azure Service Operator (an equivalent of Crossplane, but specifically for Azure). The controller is throttled so heavily that reconciling HelmReleases fails with this error:

✗ client rate limiter Wait returned an error: context deadline exceeded

Example of client-side throttling errors:

I1124 09:38:12.606756       7 request.go:682] Waited for 1.014627573s due to client-side throttling, not priority and fairness, request: GET:https://10.200.0.1:443/apis/flowcontrol.apiserver.k8s.io/v1beta2?timeout=32s
I1124 09:38:23.277039       7 request.go:682] Waited for 1.014878006s due to client-side throttling, not priority and fairness, request: GET:https://10.200.0.1:443/apis/signalrservice.azure.com/v1alpha1api20211001?timeout=32s
I1124 09:38:33.700502       7 request.go:682] Waited for 1.004668059s due to client-side throttling, not priority and fairness, request: GET:https://10.200.0.1:443/apis/rabbitmq.com/v1alpha1?timeout=32s
I1124 09:38:43.937738       7 request.go:682] Waited for 1.007247339s due to client-side throttling, not priority and fairness, request: GET:https://10.200.0.1:443/apis/dbforpostgresql.azure.com/v1alpha1api20210601storage?timeout=32s
I1124 09:38:54.428014       7 request.go:682] Waited for 1.015669903s due to client-side throttling, not priority and fairness, request: GET:https://10.200.0.1:443/apis/authorization.azure.com/v1beta20200801preview?timeout=32s
I1124 09:39:04.657925       7 request.go:682] Waited for 1.000940206s due to client-side throttling, not priority and fairness, request: GET:https://10.200.0.1:443/apis/policy/v1beta1?timeout=32s
I1124 09:39:15.203590       7 request.go:682] Waited for 1.009544821s due to client-side throttling, not priority and fairness, request: GET:https://10.200.0.1:443/apis/compute.azure.com/v1beta20201201storage?timeout=32s

The controller version is v0.26.0.

We do not encounter this behavior on clusters with fewer CRDs (without Azure Service Operator).

masterphenix avatar Nov 24 '22 09:11 masterphenix

Same here. In my case it happens with the Crossplane CRDs, on the same controller version (0.26.0). The same Flux and helm-controller configuration works fine on a cluster without the Crossplane CRDs.

This causes problems for the helm-controller pods: the controller stops and cannot run for more than a few seconds.

Why does this controller need to query API groups other than the Helm-related CRDs?

angelbarrera92 avatar Nov 30 '22 11:11 angelbarrera92