Requests for catalog delayed
Describe the bug
While testing #4918, I noticed an oddity that causes requests for GetAvailablePackageSummaries to be delayed significantly (sometimes up to 16s).
While watching the logs for pinniped-proxy, I can see the delay happens because every request creates a new REST API mapper, and I suspect it is being throttled. Each paginated request results in a burst of requests to the API server:
INFO pinniped_proxy::expired_value_cache > cache hit
INFO pinniped_proxy::expired_value_cache > cache hit
INFO pinniped_proxy::expired_value_cache > cache hit
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/autoscaling/v2beta2?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/scheduling.k8s.io/v1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/batch/v1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/kubeapps.com/v1alpha1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/authentication.k8s.io/v1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/autoscaling/v1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/apps/v1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/discovery.k8s.io/v1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/events.k8s.io/v1beta1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/coordination.k8s.io/v1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/batch/v1beta1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/flowcontrol.apiserver.k8s.io/v1beta1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/autoscaling/v2?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/node.k8s.io/v1beta1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/storage.k8s.io/v1beta1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/node.k8s.io/v1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/apiregistration.k8s.io/v1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/certificates.k8s.io/v1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/events.k8s.io/v1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/api/v1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/storage.k8s.io/v1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/flowcontrol.apiserver.k8s.io/v1beta2?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/authorization.k8s.io/v1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/discovery.k8s.io/v1beta1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/policy/v1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/apiextensions.k8s.io/v1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/admissionregistration.k8s.io/v1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/config.concierge.pinniped.dev/v1alpha1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/identity.concierge.pinniped.dev/v1alpha1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/networking.k8s.io/v1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/policy/v1beta1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/autoscaling/v2beta1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/authentication.concierge.pinniped.dev/v1alpha1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/rbac.authorization.k8s.io/v1?timeout=32s 200 OK
INFO pinniped_proxy::service > GET https://kubernetes.default/apis/login.concierge.pinniped.dev/v1alpha1?timeout=32s 200 OK
INFO pinniped_proxy::expired_value_cache > cache hit
INFO pinniped_proxy::service > POST https://kubernetes.default/apis/authorization.k8s.io/v1/selfsubjectaccessreviews 201 Created
and this repeats for each paginated request, taking longer each time.
To Reproduce
Steps to reproduce the behavior:
- Install Kubeapps with the Pinniped setup
- Follow the logs of the pinniped-proxy container
- Access the catalog (with the Bitnami repo installed, as it is by default)
- Observe the delay
Expected behavior
Each page should be returned promptly, and there should not be multiple requests to generate the REST mapper (perhaps we need to cache it per token; not sure yet).
We need to investigate whether this only affects the Pinniped setup (I assume so, since I have not seen it before and have been testing this pagination quite a bit lately) and, if so, why.
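A minimal sketch of what caching the REST mapper per token could look like on the kubeapps-apis (Go) side, assuming a per-request rest.Config that already carries the user's bearer token. The cache type, key, and function names are hypothetical, and cache expiry is omitted for brevity:

// Hypothetical sketch: keep one RESTMapper per user token in kubeapps-apis so that
// repeated paginated requests do not re-run API discovery (the burst of
// GET /apis/... calls seen in the pinniped-proxy log above).
package clientcache

import (
	"crypto/sha256"
	"fmt"
	"sync"

	"k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/client-go/discovery"
	"k8s.io/client-go/discovery/cached/memory"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/restmapper"
)

type mapperCache struct {
	mu      sync.Mutex
	mappers map[string]meta.RESTMapper
}

func newMapperCache() *mapperCache {
	return &mapperCache{mappers: map[string]meta.RESTMapper{}}
}

// restMapperForToken returns a cached mapper for the token, creating it on first use.
func (c *mapperCache) restMapperForToken(config *rest.Config, token string) (meta.RESTMapper, error) {
	// Key on a hash of the token rather than the raw token.
	key := fmt.Sprintf("%x", sha256.Sum256([]byte(token)))

	c.mu.Lock()
	defer c.mu.Unlock()
	if m, ok := c.mappers[key]; ok {
		return m, nil // cache hit: no discovery round-trips
	}

	dc, err := discovery.NewDiscoveryClientForConfig(config)
	if err != nil {
		return nil, err
	}
	// The deferred mapper performs discovery lazily and memoizes the results,
	// so the ~40 GET /apis/<group>/<version> calls happen at most once per token.
	m := restmapper.NewDeferredDiscoveryRESTMapper(memory.NewMemCacheClient(dc))
	c.mappers[key] = m
	return m, nil
}

A real implementation would also need to expire entries (similar in spirit to pinniped-proxy's expired_value_cache seen in the log) so stale credentials do not keep mappers alive indefinitely.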
There is no throttling between kubeapps-apis and pinniped-proxy, as all clients have a qps/burst configuration (at least since #5379). The behavior is the same as kubeapps-apis would have directly against the k8s API server.
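For reference, this is the kind of client-side rate limiting being referred to: client-go throttles outgoing requests according to the QPS and Burst fields of rest.Config. A minimal sketch with purely illustrative values:

// Illustrative only: client-go throttles outgoing requests based on these two fields.
package ratelimit

import "k8s.io/client-go/rest"

// withRateLimits raises the client-go defaults (QPS 5, Burst 10) so that
// discovery bursts are not throttled by the client itself.
func withRateLimits(config *rest.Config) *rest.Config {
	config.QPS = 50    // sustained requests per second allowed by the client
	config.Burst = 100 // short-term burst above QPS before throttling kicks in
	return config
}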
If there is a throttling problem, it must be in pinniped-proxy when proxying to the k8s API server. Maybe a connection pool limitation? Hyper does not allow specifying the size of the connection pool; the only tunable parameters are pool_idle_timeout and pool_max_idle_per_host.
When creating a K8s dynamic client from Go, many requests are made to the K8s API server at the same time to retrieve the available resources. The pinniped-proxy log for one example request is:
2022-09-22T13:15:27.638240 [INFO] - Starts proxying the request https://kubernetes.default/apis/helm.toolkit.fluxcd.io/v2beta1?timeout=32s
...
2022-09-22T13:15:30.410384 [INFO] - GET https://kubernetes.default/apis/helm.toolkit.fluxcd.io/v2beta1?timeout=32s 200 OK
Note that almost 3 seconds passed while proxying this request to the K8s API server. Many other requests are flowing through pinniped-proxy at the same time, so the K8s API server could be very busy at that point.
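The fan-out described above roughly corresponds to the following client-go pattern (a sketch, not the exact kubeapps-apis code): building a REST mapper alongside a dynamic client queries every API group/version the cluster serves, and client-go issues those discovery requests largely in parallel.

// Sketch of the discovery fan-out behind the log above: one GET per API group/version.
package discoveryburst

import (
	"k8s.io/apimachinery/pkg/api/meta"
	"k8s.io/client-go/discovery"
	"k8s.io/client-go/dynamic"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/restmapper"
)

func newDynamicClientAndMapper(config *rest.Config) (dynamic.Interface, meta.RESTMapper, error) {
	dyn, err := dynamic.NewForConfig(config) // no discovery traffic by itself
	if err != nil {
		return nil, nil, err
	}

	dc, err := discovery.NewDiscoveryClientForConfig(config)
	if err != nil {
		return nil, nil, err
	}

	// This single call fetches the resource list of every advertised group/version
	// (one GET /apis/<group>/<version> each), producing the ~40-request burst that
	// pinniped-proxy has to proxy for every page if nothing is cached.
	groupResources, err := restmapper.GetAPIGroupResources(dc)
	if err != nil {
		return nil, nil, err
	}
	return dyn, restmapper.NewDiscoveryRESTMapper(groupResources), nil
}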
A way to improve performance further would be to investigate whether increasing the size of Hyper's connection pool helps.
I did a quick test with a script that requests 39 different resource URLs through pinniped-proxy, the same ones requested when creating a dynamic client from Go. When the requests are made one right after the other, the total elapsed time is ~2 seconds; when they are made in parallel, it is roughly twice that. I'll run the same script directly against the K8s API server to compare.
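For comparison, a hypothetical Go version of that quick test: hit a handful of discovery-style URLs through pinniped-proxy, first sequentially and then in parallel, and print the elapsed time for each run. The proxy address, token, and URL list are placeholders.

// Hypothetical benchmark: sequential vs parallel requests through pinniped-proxy.
package main

import (
	"crypto/tls"
	"fmt"
	"net/http"
	"sync"
	"time"
)

func fetch(client *http.Client, url, token string) {
	req, _ := http.NewRequest(http.MethodGet, url, nil)
	req.Header.Set("Authorization", "Bearer "+token)
	if resp, err := client.Do(req); err != nil {
		fmt.Println("error:", err)
	} else {
		resp.Body.Close()
	}
}

func main() {
	// Placeholder subset of the ~39 discovery URLs; in the real test every
	// /apis/<group>/<version> path served by the cluster would be listed.
	urls := []string{
		"https://localhost:3333/apis/apps/v1",
		"https://localhost:3333/apis/batch/v1",
		"https://localhost:3333/api/v1",
	}
	token := "REDACTED" // placeholder bearer token
	client := &http.Client{Transport: &http.Transport{
		TLSClientConfig: &tls.Config{InsecureSkipVerify: true}, // test environment only
	}}

	start := time.Now()
	for _, u := range urls {
		fetch(client, u, token)
	}
	fmt.Println("sequential:", time.Since(start))

	start = time.Now()
	var wg sync.WaitGroup
	for _, u := range urls {
		wg.Add(1)
		go func(u string) {
			defer wg.Done()
			fetch(client, u, token)
		}(u)
	}
	wg.Wait()
	fmt.Println("parallel:  ", time.Since(start))
}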
All of the above assumes that the problem is in Rust acting as a client of the K8s API server, but it could also be that the Rust side is not able to serve that many concurrent requests.