aws-cloud-map-mcs-controller-for-k8s

Cloud Map API Throttling

Open alfianabdi opened this issue 2 years ago • 7 comments

I keep getting API errors from the MCS Cloud Map controller:

{"level":"error","ts":1649927224.8055146,"logger":"controllers.Cloudmap","msg":"Cloud Map reconciliation error","error":"operation error ServiceDiscovery: ListServices, exceeded maximum number of attempts, 3, https response error StatusCode: 400, RequestID: 81d0ef9f-ed2b-49d5-98ee-b3b1f6934f7f, api error ThrottlingException: Rate exceeded","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:132\ngithub.com/aws/aws-cloud-map-mcs-controller-for-k8s/pkg/common.logger.Error\n\t/workspace/pkg/common/logger.go:39\ngithub.com/aws/aws-cloud-map-mcs-controller-for-k8s/pkg/controllers.(*CloudMapReconciler).Start\n\t/workspace/pkg/controllers/cloudmap_controller.go:43\nsigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable.func1\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/internal.go:681"}

How can we configure the Cloud Map controller to reduce the number of API calls?

Thanks

alfianabdi avatar Apr 14 '22 09:04 alfianabdi

Hello @alfianabdi - We have a PR in the main branch that reduces the number of API calls: https://github.com/aws/aws-cloud-map-mcs-controller-for-k8s/commit/97072a6ae7fb8cb7af2a463ee1e4ae9a44ccf6ee. We will publish a patch release soon.

runakash avatar Apr 18 '22 20:04 runakash

Hello @alfianabdi - We have released #129. Can you retry?

runakash avatar Apr 20 '22 02:04 runakash

Hi @runakash

It was OK at that time since there were only a few ServiceExports, but now I am getting throttled again.

kc get serviceexports.multicluster.x-k8s.io -A | wc -l
29

Here are some logs:

{"level":"error","ts":1661336067.2116988,"logger":"controllers.Cloudmap","msg":"Cloud Map reconciliation error","error":"operation error ServiceDiscovery: ListServices, retry quota exceeded, 2 available, 5 requested","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:132\ngithub.com/aws/aws-cloud-map-mcs-controller-for-k8s/pkg/common.logger.Error\n\t/workspace/pkg/common/logger.go:39\ngithub.com/aws/aws-cloud-map-mcs-controller-for-k8s/pkg/controllers.(*CloudMapReconciler).Start\n\t/workspace/pkg/controllers/cloudmap_controller.go:43\nsigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable.func1\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/internal.go:681"}
{"level":"error","ts":1661336071.7246952,"logger":"controllers.Cloudmap","msg":"failed to fetch the list Services","error":"operation error ServiceDiscovery: ListServices, exceeded maximum number of attempts, 3, https response error StatusCode: 400, RequestID: 217341af-6d1d-4b27-89ff-d3482e7f84c3, api error ThrottlingException: Rate exceeded","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:132\ngithub.com/aws/aws-cloud-map-mcs-controller-for-k8s/pkg/common.logger.Error\n\t/workspace/pkg/common/logger.go:39\ngithub.com/aws/aws-cloud-map-mcs-controller-for-k8s/pkg/controllers.(*CloudMapReconciler).reconcileNamespace\n\t/workspace/pkg/controllers/cloudmap_controller.go:76\ngithub.com/aws/aws-cloud-map-mcs-controller-for-k8s/pkg/controllers.(*CloudMapReconciler).Reconcile\n\t/workspace/pkg/controllers/cloudmap_controller.go:63\ngithub.com/aws/aws-cloud-map-mcs-controller-for-k8s/pkg/controllers.(*CloudMapReconciler).Start\n\t/workspace/pkg/controllers/cloudmap_controller.go:41\nsigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable.func1\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/internal.go:681"}
{"level":"error","ts":1661336071.7247527,"logger":"controllers.Cloudmap","msg":"Cloud Map reconciliation error","error":"operation error ServiceDiscovery: ListServices, exceeded maximum number of attempts, 3, https response error StatusCode: 400, RequestID: 217341af-6d1d-4b27-89ff-d3482e7f84c3, api error ThrottlingException: Rate exceeded","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:132\ngithub.com/aws/aws-cloud-map-mcs-controller-for-k8s/pkg/common.logger.Error\n\t/workspace/pkg/common/logger.go:39\ngithub.com/aws/aws-cloud-map-mcs-controller-for-k8s/pkg/controllers.(*CloudMapReconciler).Start\n\t/workspace/pkg/controllers/cloudmap_controller.go:43\nsigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable.func1\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/internal.go:681"}
{"level":"error","ts":1661336075.475138,"logger":"controllers.Cloudmap","msg":"failed to fetch the list Services","error":"operation error ServiceDiscovery: ListServices, retry quota exceeded, 3 available, 5 requested","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:132\ngithub.com/aws/aws-cloud-map-mcs-controller-for-k8s/pkg/common.logger.Error\n\t/workspace/pkg/common/logger.go:39\ngithub.com/aws/aws-cloud-map-mcs-controller-for-k8s/pkg/controllers.(*CloudMapReconciler).reconcileNamespace\n\t/workspace/pkg/controllers/cloudmap_controller.go:76\ngithub.com/aws/aws-cloud-map-mcs-controller-for-k8s/pkg/controllers.(*CloudMapReconciler).Reconcile\n\t/workspace/pkg/controllers/cloudmap_controller.go:63\ngithub.com/aws/aws-cloud-map-mcs-controller-for-k8s/pkg/controllers.(*CloudMapReconciler).Start\n\t/workspace/pkg/controllers/cloudmap_controller.go:41\nsigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable.func1\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/internal.go:681"}
{"level":"error","ts":1661336075.4751897,"logger":"controllers.Cloudmap","msg":"Cloud Map reconciliation error","error":"operation error ServiceDiscovery: ListServices, retry quota exceeded, 3 available, 5 requested","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:132\ngithub.com/aws/aws-cloud-map-mcs-controller-for-k8s/pkg/common.logger.Error\n\t/workspace/pkg/common/logger.go:39\ngithub.com/aws/aws-cloud-map-mcs-controller-for-k8s/pkg/controllers.(*CloudMapReconciler).Start\n\t/workspace/pkg/controllers/cloudmap_controller.go:43\nsigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).startRunnable.func1\n\t/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/manager/internal.go:681"}
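The "retry quota exceeded, N available, 5 requested" lines above appear to come from the SDK's client-side retry token bucket rather than from Cloud Map itself: each retry has to withdraw tokens, and once earlier throttled calls have drained the bucket, further retries are refused locally. Below is a simplified sketch of that mechanism, not the SDK's code. The type and method names are hypothetical, and the capacity and per-retry cost are assumptions picked to reproduce the numbers in the log.

```go
package main

import "fmt"

// retryQuota is a hypothetical, simplified retry token bucket.
// It is not safe for concurrent use; a real one would need locking.
type retryQuota struct {
	capacity  int // maximum number of tokens the bucket can hold
	available int // tokens currently left for retries
}

// newRetryQuota returns a full bucket with the given capacity.
func newRetryQuota(capacity int) *retryQuota {
	return &retryQuota{capacity: capacity, available: capacity}
}

// acquire withdraws cost tokens for one retry attempt.
// If too few tokens remain, the retry is refused without calling the API.
func (q *retryQuota) acquire(cost int) error {
	if cost > q.available {
		return fmt.Errorf("retry quota exceeded, %d available, %d requested", q.available, cost)
	}
	q.available -= cost
	return nil
}

// release returns n tokens (for example after a successful call),
// never exceeding the bucket's capacity.
func (q *retryQuota) release(n int) {
	q.available += n
	if q.available > q.capacity {
		q.available = q.capacity
	}
}
```

With an assumed capacity of 12 and a cost of 5 per retry, two calls to acquire succeed and the third is refused with 2 tokens left, which is the same shape as the first log line above. The practical point is that the bucket only refills when calls succeed, so sustained throttling keeps it empty.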

alfianabdi avatar Aug 24 '22 10:08 alfianabdi

Hey @alfianabdi - Thanks for reporting! How many clusters do you have in your setup?

Also, which version of the controller have you installed?

runakash avatar Aug 24 '22 16:08 runakash

We filter ListServices calls by namespace ID, so services exported across many namespaces in a clusterset will cause cache misses and throttling, even in a single cluster.

astaticvoid avatar Aug 24 '22 18:08 astaticvoid

We can change the key of the service ID cache to namespace + service name, and filter the ListServices call against the relevant namespace IDs on reconcile. As a workaround until this is implemented and released, is it possible to combine or reduce the number of namespaces being exported?
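A minimal sketch of the namespace + service name cache key, using only the standard library. This is not the controller's code: the type and method names are hypothetical, and a real cache would also need entry expiry.

```go
package main

import "sync"

// serviceKey is a hypothetical composite cache key: namespace name plus
// service name, so same-named services in different namespaces never collide.
type serviceKey struct {
	namespace string
	service   string
}

// serviceIDCache maps (namespace, service name) to a Cloud Map service ID.
type serviceIDCache struct {
	mu  sync.RWMutex
	ids map[serviceKey]string
}

// newServiceIDCache returns an empty cache ready for use.
func newServiceIDCache() *serviceIDCache {
	return &serviceIDCache{ids: make(map[serviceKey]string)}
}

// get returns the cached service ID, and false on a cache miss.
// Only a miss would need to fall back to a ListServices call.
func (c *serviceIDCache) get(namespace, service string) (string, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	id, ok := c.ids[serviceKey{namespace: namespace, service: service}]
	return id, ok
}

// put stores the service ID under the composite key.
func (c *serviceIDCache) put(namespace, service, id string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.ids[serviceKey{namespace: namespace, service: service}] = id
}
```

Because a struct of two strings is comparable in Go, it can be used directly as a map key, which avoids building and parsing concatenated key strings.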

astaticvoid avatar Aug 24 '22 19:08 astaticvoid

@runakash

Currently only two clusters. All of those ServiceExports are from different namespaces.

@astaticvoid

Thanks for the info. I will try to reduce the number of namespaces until the fix is released.

alfianabdi avatar Aug 25 '22 01:08 alfianabdi

I am facing the exact same issue. In my case, there is only 1 namespace and around 600 ServiceExports. What is the recommended workaround for this?

{"level":"error","ts":1667198035.2729692,"logger":"controllers.ServiceExport","msg":"error deleting Endpoints from Cloud Map","namespace":"test","name":"testappv2","error":"operation error ServiceDiscovery: ListOperations, exceeded maximum number of attempts, 3, https response error StatusCode: 400, RequestID: xxxx, api error ThrottlingException: Rate exceeded"}

sujai-sivasamy avatar Nov 03 '22 05:11 sujai-sivasamy

Hello sujai-sivasamy, we are working on optimizations and will let you know once we have a new version out. We don't have an ETA yet.

runakash avatar Nov 03 '22 05:11 runakash

We have released a new version that should reduce the API throttling: https://github.com/aws/aws-cloud-map-mcs-controller-for-k8s/releases/tag/v0.3.1. Feel free to let us know if you encounter any new errors.

runakash avatar Dec 12 '22 23:12 runakash