kubeapps
kubeapps copied to clipboard
Improve response times in Kubeapps APIs when using Pinniped-proxy
Summary
Some requests (e.g. core.packages.v1alpha1.PackagesService.GetAvailablePackageSummaries
) are slow when using pinniped-proxy
. Investigate and improve those times.
Description
After work done for #5362 and #4919, response times have improved due to not creating unused multiple K8s Go clients.
However, there is still what seems to be a throttling problem when using pinniped-proxy
. Multiple requests are done from kubeapps-apis
to K8s API (with correct QPS and Burst values), but requests go through pinniped-proxy
.
Log from the proxy:
2022-09-28T13:48:38.784778 [INFO] - GET https://kubernetes.default/apis/flowcontrol.apiserver.k8s.io/v1beta2?timeout=32s 200 OK
2022-09-28T13:48:38.787947 [INFO] - GET https://kubernetes.default/apis/flowcontrol.apiserver.k8s.io/v1beta1?timeout=32s 200 OK
2022-09-28T13:48:38.788917 [INFO] - GET https://kubernetes.default/apis/storage.k8s.io/v1?timeout=32s 200 OK
2022-09-28T13:48:38.789350 [INFO] - GET https://kubernetes.default/apis/kubeapps.com/v1alpha1?timeout=32s 200 OK
2022-09-28T13:48:38.880150 [INFO] - GET https://kubernetes.default/apis/authorization.k8s.io/v1?timeout=32s 200 OK
2022-09-28T13:48:38.880827 [INFO] - GET https://kubernetes.default/apis/apiregistration.k8s.io/v1?timeout=32s 200 OK
2022-09-28T13:48:38.885347 [INFO] - GET https://kubernetes.default/apis/source.toolkit.fluxcd.io/v1beta2?timeout=32s 200 OK
2022-09-28T13:48:38.885365 [INFO] - GET https://kubernetes.default/apis/identity.concierge.pinniped.dev/v1alpha1?timeout=32s 200 OK
2022-09-28T13:48:38.886346 [INFO] - GET https://kubernetes.default/apis/image.toolkit.fluxcd.io/v1beta1?timeout=32s 200 OK
2022-09-28T13:48:38.887521 [INFO] - GET https://kubernetes.default/apis/kustomize.toolkit.fluxcd.io/v1beta1?timeout=32s 200 OK
2022-09-28T13:48:38.887959 [INFO] - GET https://kubernetes.default/apis/networking.k8s.io/v1?timeout=32s 200 OK
2022-09-28T13:48:38.888758 [INFO] - GET https://kubernetes.default/apis/coordination.k8s.io/v1?timeout=32s 200 OK
2022-09-28T13:48:38.889212 [INFO] - GET https://kubernetes.default/apis/events.k8s.io/v1?timeout=32s 200 OK
2022-09-28T13:48:38.980185 [INFO] - GET https://kubernetes.default/apis/policy/v1?timeout=32s 200 OK
2022-09-28T13:48:38.981594 [INFO] - GET https://kubernetes.default/apis/node.k8s.io/v1?timeout=32s 200 OK
2022-09-28T13:48:38.982572 [INFO] - GET https://kubernetes.default/api/v1?timeout=32s 200 OK
2022-09-28T13:48:38.984052 [INFO] - GET https://kubernetes.default/apis/batch/v1?timeout=32s 200 OK
2022-09-28T13:48:38.984456 [INFO] - GET https://kubernetes.default/apis/helm.toolkit.fluxcd.io/v2beta1?timeout=32s 200 OK
2022-09-28T13:48:38.984862 [INFO] - GET https://kubernetes.default/apis/rbac.authorization.k8s.io/v1?timeout=32s 200 OK
2022-09-28T13:48:38.984924 [INFO] - GET https://kubernetes.default/apis/login.concierge.pinniped.dev/v1alpha1?timeout=32s 200 OK
2022-09-28T13:48:38.985329 [INFO] - GET https://kubernetes.default/apis/config.concierge.pinniped.dev/v1alpha1?timeout=32s 200 OK
2022-09-28T13:48:38.985651 [INFO] - GET https://kubernetes.default/apis/authentication.concierge.pinniped.dev/v1alpha1?timeout=32s 200 OK
2022-09-28T13:48:38.986150 [INFO] - GET https://kubernetes.default/apis/image.toolkit.fluxcd.io/v1alpha1?timeout=32s 200 OK
2022-09-28T13:48:38.986452 [INFO] - GET https://kubernetes.default/apis/image.toolkit.fluxcd.io/v1alpha2?timeout=32s 200 OK
2022-09-28T13:48:39.080254 [INFO] - GET https://kubernetes.default/apis/scheduling.k8s.io/v1?timeout=32s 200 OK
2022-09-28T13:48:39.081032 [INFO] - GET https://kubernetes.default/apis/admissionregistration.k8s.io/v1?timeout=32s 200 OK
2022-09-28T13:48:39.081128 [INFO] - GET https://kubernetes.default/apis/apps/v1?timeout=32s 200 OK
2022-09-28T13:48:39.083395 [INFO] - GET https://kubernetes.default/apis/apiextensions.k8s.io/v1?timeout=32s 200 OK
2022-09-28T13:48:39.084938 [INFO] - GET https://kubernetes.default/apis/storage.k8s.io/v1beta1?timeout=32s 200 OK
2022-09-28T13:48:39.086317 [INFO] - GET https://kubernetes.default/apis/autoscaling/v2?timeout=32s 200 OK
2022-09-28T13:48:39.088025 [INFO] - GET https://kubernetes.default/apis/autoscaling/v1?timeout=32s 200 OK
2022-09-28T13:48:39.180407 [INFO] - GET https://kubernetes.default/apis/notification.toolkit.fluxcd.io/v1beta1?timeout=32s 200 OK
2022-09-28T13:48:39.180475 [INFO] - GET https://kubernetes.default/apis/kustomize.toolkit.fluxcd.io/v1beta2?timeout=32s 200 OK
2022-09-28T13:48:39.181346 [INFO] - GET https://kubernetes.default/apis/certificates.k8s.io/v1?timeout=32s 200 OK
2022-09-28T13:48:39.182219 [INFO] - GET https://kubernetes.default/apis/authentication.k8s.io/v1?timeout=32s 200 OK
2022-09-28T13:48:39.183452 [INFO] - GET https://kubernetes.default/apis/source.toolkit.fluxcd.io/v1beta1?timeout=32s 200 OK
2022-09-28T13:48:39.186020 [INFO] - GET https://kubernetes.default/apis/discovery.k8s.io/v1?timeout=32s 200 OK
2022-09-28T13:48:39.187938 [INFO] - GET https://kubernetes.default/apis/autoscaling/v2beta2?timeout=32s 200 OK
When kubeapps-apis
is configured without pinniped-proxy
, response times for those requests are way faster. This might indicate that there is a throttling problem from pinniped-proxy
towards the actual K8s API server.
Acceptance criteria
- Response times for Kubeapps APIs is improved when using
pinniped-proxy
., to similar times obtained withoutpinniped-proxy
.
Thanks @castelblanque . Yeah, that's intriguing given that we're caching the credential. I'll take a look.
So I started looking at this today - and it seems quite an interesting problem/bug. For what it's worth, I don't think it's related to throttling of requests to the kubernetes-api server from pinniped-proxy, but rather, an issue in the way async requests are handled by pinniped-proxy
itself.
In summary, tokio
, the asynchronous runtime used with hyper
to provide many concurrent tasks per CPU thread (similar to go-routines, just not built in to the language), creates a new task for each request as it arrives, then marks the task as idle when it is waiting for i/o (in our case, a network request for the proxied response). When the network request is completed, it is marked as runnable again, and should be put into the front of the queue (the next-task slot), but seems to be put on the back of the queue.
Let's see this in the detail: I added a few extra logging statements during the proxy operation so we can tell when the request begins, when the id/cert exchange happens and when the actual downstream request begins (or see this gist for logging with trace of tokio's mio crate )
The first three queries are proxied and returned (last being the /apis) and each takes between 30-40ms:
2022-10-17T03:25:43.657974 [INFO] - Beginning proxy request for https://kubernetes.default/api?timeout=32s
2022-10-17T03:25:43.659170 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:43.659756 [INFO] - Beginning proxy request for https://kubernetes.default/apis/authorization.k8s.io/v1/selfsubjectaccessreviews
2022-10-17T03:25:43.659807 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:43.682590 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:43.687346 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:43.692050 [INFO] - GET https://kubernetes.default/api?timeout=32s 200 OK
2022-10-17T03:25:43.692896 [INFO] - Beginning proxy request for https://kubernetes.default/apis?timeout=32s
2022-10-17T03:25:43.692962 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:43.701520 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:43.702344 [INFO] - POST https://kubernetes.default/apis/authorization.k8s.io/v1/selfsubjectaccessreviews 201 Created
2022-10-17T03:25:43.707072 [INFO] - GET https://kubernetes.default/apis?timeout=32s 200 OK
It is this response which, when returned upstream to the kubeapps-apis
service (which is building the dynamic client and trying to dynamically determine all the available apis as part of the REST mapper creation), the barrage of requests arrives for all the available APIs. The important point is that each request is creating a new task and it appears that all these tasks are being processed FIFO, apparently not concurrently (?!), without allowing any responses back until all the new request tasks have been marked as idle:
2022-10-17T03:25:43.708332 [INFO] - Beginning proxy request for https://kubernetes.default/api/v1?timeout=32s
2022-10-17T03:25:43.708410 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:43.716813 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:43.717562 [INFO] - Beginning proxy request for https://kubernetes.default/apis/events.k8s.io/v1?timeout=32s
2022-10-17T03:25:43.717602 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:43.726343 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:43.727344 [INFO] - Beginning proxy request for https://kubernetes.default/apis/coordination.k8s.io/v1?timeout=32s
2022-10-17T03:25:43.727406 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:43.811001 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:43.812148 [INFO] - Beginning proxy request for https://kubernetes.default/apis/policy/v1?timeout=32s
2022-10-17T03:25:43.812213 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:43.824686 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:43.825403 [INFO] - Beginning proxy request for https://kubernetes.default/apis/events.k8s.io/v1beta1?timeout=32s
2022-10-17T03:25:43.825441 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:43.913587 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:43.914525 [INFO] - Beginning proxy request for https://kubernetes.default/apis/rbac.authorization.k8s.io/v1?timeout=32s
2022-10-17T03:25:43.914582 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:43.923789 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:43.924695 [INFO] - Beginning proxy request for https://kubernetes.default/apis/login.concierge.pinniped.dev/v1alpha1?timeout=32s
2022-10-17T03:25:43.924740 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.009504 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.010713 [INFO] - Beginning proxy request for https://kubernetes.default/apis/image.toolkit.fluxcd.io/v1beta1?timeout=32s
2022-10-17T03:25:44.010772 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.019752 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.020884 [INFO] - Beginning proxy request for https://kubernetes.default/apis/authentication.concierge.pinniped.dev/v1alpha1?timeout=32s
2022-10-17T03:25:44.020932 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.107160 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.108268 [INFO] - Beginning proxy request for https://kubernetes.default/apis/flowcontrol.apiserver.k8s.io/v1beta1?timeout=32s
2022-10-17T03:25:44.108326 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.117917 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.118979 [INFO] - Beginning proxy request for https://kubernetes.default/apis/autoscaling/v2beta2?timeout=32s
2022-10-17T03:25:44.119037 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.204251 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.205376 [INFO] - Beginning proxy request for https://kubernetes.default/apis/networking.k8s.io/v1?timeout=32s
2022-10-17T03:25:44.205427 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.214138 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.215287 [INFO] - Beginning proxy request for https://kubernetes.default/apis/autoscaling/v2beta1?timeout=32s
2022-10-17T03:25:44.215355 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.224067 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.225180 [INFO] - Beginning proxy request for https://kubernetes.default/apis/policy/v1beta1?timeout=32s
2022-10-17T03:25:44.225219 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.309400 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.310438 [INFO] - Beginning proxy request for https://kubernetes.default/apis/config.concierge.pinniped.dev/v1alpha1?timeout=32s
2022-10-17T03:25:44.310487 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.319451 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.320575 [INFO] - Beginning proxy request for https://kubernetes.default/apis/authentication.k8s.io/v1?timeout=32s
2022-10-17T03:25:44.320613 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.405115 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.406333 [INFO] - Beginning proxy request for https://kubernetes.default/apis/identity.concierge.pinniped.dev/v1alpha1?timeout=32s
2022-10-17T03:25:44.406387 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.415010 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.416044 [INFO] - Beginning proxy request for https://kubernetes.default/apis/discovery.k8s.io/v1?timeout=32s
2022-10-17T03:25:44.416089 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.424744 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.426010 [INFO] - Beginning proxy request for https://kubernetes.default/apis/autoscaling/v1?timeout=32s
2022-10-17T03:25:44.501688 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.510537 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.511581 [INFO] - Beginning proxy request for https://kubernetes.default/apis/image.toolkit.fluxcd.io/v1alpha1?timeout=32s
2022-10-17T03:25:44.511631 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.520423 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.521591 [INFO] - Beginning proxy request for https://kubernetes.default/apis/admissionregistration.k8s.io/v1?timeout=32s
2022-10-17T03:25:44.521640 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.606678 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.608375 [INFO] - Beginning proxy request for https://kubernetes.default/apis/apiregistration.k8s.io/v1?timeout=32s
2022-10-17T03:25:44.608436 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.617552 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.618658 [INFO] - Beginning proxy request for https://kubernetes.default/apis/batch/v1?timeout=32s
2022-10-17T03:25:44.618710 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.628218 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.629265 [INFO] - Beginning proxy request for https://kubernetes.default/apis/helm.toolkit.fluxcd.io/v2beta1?timeout=32s
2022-10-17T03:25:44.629315 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.711106 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.712410 [INFO] - Beginning proxy request for https://kubernetes.default/apis/discovery.k8s.io/v1beta1?timeout=32s
2022-10-17T03:25:44.712458 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.721111 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.801966 [INFO] - Beginning proxy request for https://kubernetes.default/apis/node.k8s.io/v1beta1?timeout=32s
2022-10-17T03:25:44.802032 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.811083 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.812333 [INFO] - Beginning proxy request for https://kubernetes.default/apis/node.k8s.io/v1?timeout=32s
2022-10-17T03:25:44.812383 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.821086 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.822338 [INFO] - Beginning proxy request for https://kubernetes.default/apis/storage.k8s.io/v1beta1?timeout=32s
2022-10-17T03:25:44.822397 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.907995 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.909164 [INFO] - Beginning proxy request for https://kubernetes.default/apis/apps/v1?timeout=32s
2022-10-17T03:25:44.909213 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:44.917759 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:44.918973 [INFO] - Beginning proxy request for https://kubernetes.default/apis/scheduling.k8s.io/v1?timeout=32s
2022-10-17T03:25:44.919030 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:45.005963 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:45.007736 [INFO] - Beginning proxy request for https://kubernetes.default/apis/autoscaling/v2?timeout=32s
2022-10-17T03:25:45.007832 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:45.020117 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:45.021137 [INFO] - Beginning proxy request for https://kubernetes.default/apis/authorization.k8s.io/v1?timeout=32s
2022-10-17T03:25:45.021178 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:45.105791 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:45.107111 [INFO] - Beginning proxy request for https://kubernetes.default/apis/source.toolkit.fluxcd.io/v1beta1?timeout=32s
2022-10-17T03:25:45.107169 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:45.115979 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:45.117122 [INFO] - Beginning proxy request for https://kubernetes.default/apis/storage.k8s.io/v1?timeout=32s
2022-10-17T03:25:45.117171 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:45.205532 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:45.207104 [INFO] - Beginning proxy request for https://kubernetes.default/apis/flowcontrol.apiserver.k8s.io/v1beta2?timeout=32s
2022-10-17T03:25:45.207180 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:45.222913 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:45.224620 [INFO] - Beginning proxy request for https://kubernetes.default/apis/certificates.k8s.io/v1?timeout=32s
2022-10-17T03:25:45.224708 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:45.308181 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:45.309547 [INFO] - Beginning proxy request for https://kubernetes.default/apis/kustomize.toolkit.fluxcd.io/v1beta1?timeout=32s
2022-10-17T03:25:45.309600 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:45.318831 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:45.320041 [INFO] - Beginning proxy request for https://kubernetes.default/apis/source.toolkit.fluxcd.io/v1beta2?timeout=32s
2022-10-17T03:25:45.320090 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:45.405111 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:45.406595 [INFO] - Beginning proxy request for https://kubernetes.default/apis/notification.toolkit.fluxcd.io/v1beta1?timeout=32s
2022-10-17T03:25:45.406644 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:45.415655 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:45.416844 [INFO] - Beginning proxy request for https://kubernetes.default/apis/batch/v1beta1?timeout=32s
2022-10-17T03:25:45.416886 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:45.425878 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:45.502684 [INFO] - Beginning proxy request for https://kubernetes.default/apis/image.toolkit.fluxcd.io/v1alpha2?timeout=32s
2022-10-17T03:25:45.502744 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:45.512051 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:45.513227 [INFO] - Beginning proxy request for https://kubernetes.default/apis/kustomize.toolkit.fluxcd.io/v1beta2?timeout=32s
2022-10-17T03:25:45.513272 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:45.525748 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:45.602727 [INFO] - Beginning proxy request for https://kubernetes.default/apis/apiextensions.k8s.io/v1?timeout=32s
2022-10-17T03:25:45.602787 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:45.612447 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:45.613777 [INFO] - Beginning proxy request for https://kubernetes.default/apis/kubeapps.com/v1alpha1?timeout=32s
2022-10-17T03:25:45.613831 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:45.623089 [INFO] - Requesting k8s API endpoint with cert
So that began at 43.708, with the "Beginning - Exchanging - Requesting" for each request taking only around 10ms, but no requests are completed until all the concurrent incoming requests have been processed. So at 45.718, a full 2s later, the requests are all returned by the proxy, until 45.902, so over a period of 200ms.
2022-10-17T03:25:45.718840 [INFO] - GET https://kubernetes.default/apis/image.toolkit.fluxcd.io/v1alpha1?timeout=32s 200 OK
2022-10-17T03:25:45.724800 [INFO] - GET https://kubernetes.default/api/v1?timeout=32s 200 OK
2022-10-17T03:25:45.734030 [INFO] - GET https://kubernetes.default/apis/image.toolkit.fluxcd.io/v1beta1?timeout=32s 200 OK
2022-10-17T03:25:45.804941 [INFO] - GET https://kubernetes.default/apis/apiextensions.k8s.io/v1?timeout=32s 200 OK
2022-10-17T03:25:45.810390 [INFO] - GET https://kubernetes.default/apis/networking.k8s.io/v1?timeout=32s 200 OK
2022-10-17T03:25:45.811114 [INFO] - GET https://kubernetes.default/apis/policy/v1beta1?timeout=32s 200 OK
2022-10-17T03:25:45.813519 [INFO] - GET https://kubernetes.default/apis/autoscaling/v2?timeout=32s 200 OK
2022-10-17T03:25:45.814883 [INFO] - GET https://kubernetes.default/apis/discovery.k8s.io/v1?timeout=32s 200 OK
2022-10-17T03:25:45.816179 [INFO] - GET https://kubernetes.default/apis/autoscaling/v1?timeout=32s 200 OK
2022-10-17T03:25:45.816962 [INFO] - GET https://kubernetes.default/apis/authentication.k8s.io/v1?timeout=32s 200 OK
2022-10-17T03:25:45.817745 [INFO] - GET https://kubernetes.default/apis/autoscaling/v2beta1?timeout=32s 200 OK
2022-10-17T03:25:45.818542 [INFO] - GET https://kubernetes.default/apis/autoscaling/v2beta2?timeout=32s 200 OK
2022-10-17T03:25:45.819341 [INFO] - GET https://kubernetes.default/apis/config.concierge.pinniped.dev/v1alpha1?timeout=32s 200 OK
2022-10-17T03:25:45.820121 [INFO] - GET https://kubernetes.default/apis/identity.concierge.pinniped.dev/v1alpha1?timeout=32s 200 OK
2022-10-17T03:25:45.820910 [INFO] - GET https://kubernetes.default/apis/flowcontrol.apiserver.k8s.io/v1beta1?timeout=32s 200 OK
2022-10-17T03:25:45.821708 [INFO] - GET https://kubernetes.default/apis/kubeapps.com/v1alpha1?timeout=32s 200 OK
2022-10-17T03:25:45.822515 [INFO] - GET https://kubernetes.default/apis/authentication.concierge.pinniped.dev/v1alpha1?timeout=32s 200 OK
2022-10-17T03:25:45.822631 [INFO] - GET https://kubernetes.default/apis/certificates.k8s.io/v1?timeout=32s 200 OK
2022-10-17T03:25:45.823474 [INFO] - GET https://kubernetes.default/apis/storage.k8s.io/v1beta1?timeout=32s 200 OK
2022-10-17T03:25:45.824299 [INFO] - GET https://kubernetes.default/apis/authorization.k8s.io/v1?timeout=32s 200 OK
2022-10-17T03:25:45.825144 [INFO] - GET https://kubernetes.default/apis/storage.k8s.io/v1?timeout=32s 200 OK
2022-10-17T03:25:45.825296 [INFO] - GET https://kubernetes.default/apis/helm.toolkit.fluxcd.io/v2beta1?timeout=32s 200 OK
2022-10-17T03:25:45.826143 [INFO] - GET https://kubernetes.default/apis/flowcontrol.apiserver.k8s.io/v1beta2?timeout=32s 200 OK
2022-10-17T03:25:45.826982 [INFO] - GET https://kubernetes.default/apis/kustomize.toolkit.fluxcd.io/v1beta2?timeout=32s 200 OK
2022-10-17T03:25:45.829095 [INFO] - GET https://kubernetes.default/apis/policy/v1?timeout=32s 200 OK
2022-10-17T03:25:45.829914 [INFO] - GET https://kubernetes.default/apis/source.toolkit.fluxcd.io/v1beta1?timeout=32s 200 OK
2022-10-17T03:25:45.830026 [INFO] - GET https://kubernetes.default/apis/discovery.k8s.io/v1beta1?timeout=32s 200 OK
2022-10-17T03:25:45.902639 [INFO] - GET https://kubernetes.default/apis/image.toolkit.fluxcd.io/v1alpha2?timeout=32s 200 OK
2022-10-17T03:25:45.902752 [INFO] - GET https://kubernetes.default/apis/apps/v1?timeout=32s 200 OK
2022-10-17T03:25:45.902949 [INFO] - GET https://kubernetes.default/apis/admissionregistration.k8s.io/v1?timeout=32s 200 OK
2022-10-17T03:25:45.904062 [INFO] - GET https://kubernetes.default/apis/rbac.authorization.k8s.io/v1?timeout=32s 200 OK
2022-10-17T03:25:45.904242 [INFO] - GET https://kubernetes.default/apis/source.toolkit.fluxcd.io/v1beta2?timeout=32s 200 OK
2022-10-17T03:25:45.904390 [INFO] - GET https://kubernetes.default/apis/coordination.k8s.io/v1?timeout=32s 200 OK
2022-10-17T03:25:45.905383 [INFO] - GET https://kubernetes.default/apis/events.k8s.io/v1beta1?timeout=32s 200 OK
2022-10-17T03:25:45.906250 [INFO] - GET https://kubernetes.default/apis/kustomize.toolkit.fluxcd.io/v1beta1?timeout=32s 200 OK
2022-10-17T03:25:45.907137 [INFO] - GET https://kubernetes.default/apis/scheduling.k8s.io/v1?timeout=32s 200 OK
2022-10-17T03:25:45.908026 [INFO] - GET https://kubernetes.default/apis/events.k8s.io/v1?timeout=32s 200 OK
2022-10-17T03:25:45.908946 [INFO] - GET https://kubernetes.default/apis/node.k8s.io/v1beta1?timeout=32s 200 OK
2022-10-17T03:25:45.909809 [INFO] - GET https://kubernetes.default/apis/apiregistration.k8s.io/v1?timeout=32s 200 OK
2022-10-17T03:25:45.910668 [INFO] - GET https://kubernetes.default/apis/login.concierge.pinniped.dev/v1alpha1?timeout=32s 200 OK
2022-10-17T03:25:45.911502 [INFO] - GET https://kubernetes.default/apis/node.k8s.io/v1?timeout=32s 200 OK
2022-10-17T03:25:45.916386 [INFO] - GET https://kubernetes.default/apis/batch/v1?timeout=32s 200 OK
2022-10-17T03:25:45.917290 [INFO] - GET https://kubernetes.default/apis/notification.toolkit.fluxcd.io/v1beta1?timeout=32s 200 OK
2022-10-17T03:25:45.917425 [INFO] - GET https://kubernetes.default/apis/batch/v1beta1?timeout=32s 200 OK
2022-10-17T03:25:45.919930 [INFO] - Beginning proxy request for https://kubernetes.default/apis/helm.toolkit.fluxcd.io/v2beta1/namespaces/kubeapps/helmreleases
2022-10-17T03:25:45.919980 [INFO] - Exchanging id_token for X509 client identity using pinniped
2022-10-17T03:25:46.004163 [INFO] - Requesting k8s API endpoint with cert
2022-10-17T03:25:46.009001 [INFO] - GET https://kubernetes.default/apis/helm.toolkit.fluxcd.io/v2beta1/namespaces/kubeapps/helmreleases 200 OK
Read in-depth overview of the tokio scheduler. In particular the addition of the next_task
slot sounds like a solution to this exact problem - but I'm not yet sure why we're not seeing the desired behaviour? (ie. when a proxied request is complete, the task should be marked runnable and put in the next_task
slot so it can jump the queue and be returned, rather than continuing with the next incoming requests from the queue.
It'll probably be a simple configuration, but I'll continue tomorrow.
Interesting findings!
When the network request is completed, it is marked as runnable again, and should be put into the front of the queue (the next-task slot), but seems to be put on the back of the queue.
Is that an actual bug in tokio?
In the Java world, sometimes this kind of problems occur when the pool of threads is too small. But if no requests are completed until all the concurrent incoming requests have been processed
, it does not seem like related to the pool size.
Is that an actual bug in tokio?
I doubt it, more likely just a config option or something I've missed, but I'll hopefully find out tomorrow.
In the Java world, sometimes this kind of problems occur when the pool of threads is too small. But if
no requests are completed until all the concurrent incoming requests have been processed
, it does not seem like related to the pool size.
Yeah - all incoming requests get processed before it moves on to return the results for each, though it confuses me that they seem to be done serially. Interesting anyway :)
So, summary of findings after creating a minimal working example:
- There is no issue in tokio (as expected). It is correct that it processes I/O events in a FIFO manner
- The issue is a combination of a CPU-intensive calculation in the request (I suspect calculating hashes to lookup the cached credential, but will verify) and CPU resource starvation for the service itself, so that actually initiating the requests takes many seconds.
Using my minimal working example, I can reproduce the results seen by explicitly:
- Adding a 100ms (or otherwise) CPU-blocking call right before each request is made, and
- Restricting the number of CPUs available to the runtime.
If only one of these is done, the queries can still execute quickly.
I'll profile the credential cache access used in pinniped-proxy next.
...and found it: 2 things to note in the logs below:
- The request begins, but after my log message of "Exchanging id_token for..." there must be an unexpected I/O event that causes the task to go idle (even though it should be a cache request) - hence all requests get started before any request actually finishes the exchange and begins "Requesting k8s API endpoint with cert",
- Perhaps related, but the calls to
call_pinniped
are apparently stacking (perhaps because of thesync_writes
option I'm using on the cache), starting at 15ms and just growing from there as time passes up to almost 2s per request.
Finally I can now actually try to fix the issue :)
2022-10-19T06:02:32.757Z INFO pinniped_proxy::service > Beginning proxy request for https://kubernetes.default/apis/autoscaling/v2beta1?timeout=32s
2022-10-19T06:02:32.757Z INFO pinniped_proxy::service > Exchanging id_token for X509 client identity using pinniped
2022-10-19T06:02:32.757Z INFO pinniped_proxy::service > Beginning proxy request for https://kubernetes.default/apis/kustomize.toolkit.fluxcd.io/v1beta2?timeout=32s
2022-10-19T06:02:32.757Z INFO pinniped_proxy::service > Exchanging id_token for X509 client identity using pinniped
2022-10-19T06:02:32.760Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 16ms, of which 15ms was call_pinniped
2022-10-19T06:02:32.765Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:32.765Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 20ms, of which 20ms was call_pinniped
2022-10-19T06:02:32.835Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:32.836Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 91ms, of which 91ms was call_pinniped
2022-10-19T06:02:32.860Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:32.860Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 115ms, of which 115ms was call_pinniped
2022-10-19T06:02:32.945Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:32.945Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 199ms, of which 199ms was call_pinniped
2022-10-19T06:02:32.953Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:32.953Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 207ms, of which 207ms was call_pinniped
2022-10-19T06:02:33.039Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:33.039Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 293ms, of which 293ms was call_pinniped
2022-10-19T06:02:33.135Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:33.135Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 388ms, of which 388ms was call_pinniped
2022-10-19T06:02:33.157Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:33.157Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 411ms, of which 411ms was call_pinniped
2022-10-19T06:02:33.255Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:33.255Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 508ms, of which 508ms was call_pinniped
2022-10-19T06:02:33.355Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:33.355Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 608ms, of which 607ms was call_pinniped
2022-10-19T06:02:33.438Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:33.438Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 691ms, of which 691ms was call_pinniped
2022-10-19T06:02:33.444Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:33.444Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 696ms, of which 696ms was call_pinniped
2022-10-19T06:02:33.448Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:33.448Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 700ms, of which 700ms was call_pinniped
2022-10-19T06:02:33.454Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:33.454Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 705ms, of which 705ms was call_pinniped
2022-10-19T06:02:33.539Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:33.539Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 790ms, of which 790ms was call_pinniped
2022-10-19T06:02:33.636Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:33.636Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 887ms, of which 887ms was call_pinniped
2022-10-19T06:02:33.653Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:33.653Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 904ms, of which 903ms was call_pinniped
2022-10-19T06:02:33.760Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:33.760Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1011ms, of which 1010ms was call_pinniped
2022-10-19T06:02:33.837Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:33.838Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1088ms, of which 1088ms was call_pinniped
2022-10-19T06:02:33.842Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:33.842Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1092ms, of which 1092ms was call_pinniped
2022-10-19T06:02:33.847Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:33.847Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1097ms, of which 1097ms was call_pinniped
2022-10-19T06:02:33.852Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:33.852Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1101ms, of which 1101ms was call_pinniped
2022-10-19T06:02:33.947Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:33.948Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1197ms, of which 1197ms was call_pinniped
2022-10-19T06:02:34.035Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:34.035Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1284ms, of which 1283ms was call_pinniped
2022-10-19T06:02:34.133Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:34.134Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1382ms, of which 1382ms was call_pinniped
2022-10-19T06:02:34.157Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:34.157Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1405ms, of which 1405ms was call_pinniped
2022-10-19T06:02:34.238Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:34.238Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1487ms, of which 1486ms was call_pinniped
2022-10-19T06:02:34.243Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:34.243Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1491ms, of which 1491ms was call_pinniped
2022-10-19T06:02:34.248Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:34.248Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1495ms, of which 1495ms was call_pinniped
2022-10-19T06:02:34.253Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:34.253Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1500ms, of which 1499ms was call_pinniped
2022-10-19T06:02:34.335Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:34.336Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1582ms, of which 1582ms was call_pinniped
2022-10-19T06:02:34.435Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:34.435Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1681ms, of which 1680ms was call_pinniped
2022-10-19T06:02:34.452Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:34.452Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1697ms, of which 1697ms was call_pinniped
2022-10-19T06:02:34.535Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:34.535Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1780ms, of which 1780ms was call_pinniped
2022-10-19T06:02:34.542Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:34.542Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1787ms, of which 1787ms was call_pinniped
2022-10-19T06:02:34.549Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:34.549Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1793ms, of which 1793ms was call_pinniped
2022-10-19T06:02:34.634Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:34.634Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1878ms, of which 1878ms was call_pinniped
2022-10-19T06:02:34.640Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:34.640Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1883ms, of which 1883ms was call_pinniped
2022-10-19T06:02:34.645Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:34.645Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1888ms, of which 1888ms was call_pinniped
2022-10-19T06:02:34.651Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:34.651Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1894ms, of which 1893ms was call_pinniped
2022-10-19T06:02:34.657Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:34.657Z INFO pinniped_proxy::pinniped > prep-and-call-pinniped took: 1899ms, of which 1899ms was call_pinniped
2022-10-19T06:02:34.752Z INFO pinniped_proxy::service > Requesting k8s API endpoint with cert
2022-10-19T06:02:34.941Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/kustomize.toolkit.fluxcd.io/v1beta2?timeout=32s 200 OK
2022-10-19T06:02:34.943Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/node.k8s.io/v1?timeout=32s 200 OK
2022-10-19T06:02:34.946Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/config.concierge.pinniped.dev/v1alpha1?timeout=32s 200 OK
2022-10-19T06:02:34.948Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/autoscaling/v2beta2?timeout=32s 200 OK
2022-10-19T06:02:35.036Z INFO pinniped_proxy::service > GET https://kubernetes.default/api/v1?timeout=32s 200 OK
2022-10-19T06:02:35.037Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/login.concierge.pinniped.dev/v1alpha1?timeout=32s 200 OK
2022-10-19T06:02:35.040Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/storage.k8s.io/v1beta1?timeout=32s 200 OK
2022-10-19T06:02:35.043Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/autoscaling/v1?timeout=32s 200 OK
2022-10-19T06:02:35.046Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/storage.k8s.io/v1?timeout=32s 200 OK
2022-10-19T06:02:35.046Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/kubeapps.com/v1alpha1?timeout=32s 200 OK
2022-10-19T06:02:35.049Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/image.toolkit.fluxcd.io/v1alpha2?timeout=32s 200 OK
2022-10-19T06:02:35.050Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/autoscaling/v2?timeout=32s 200 OK
2022-10-19T06:02:35.052Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/networking.k8s.io/v1?timeout=32s 200 OK
2022-10-19T06:02:35.053Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/policy/v1?timeout=32s 200 OK
2022-10-19T06:02:35.056Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/authorization.k8s.io/v1?timeout=32s 200 OK
2022-10-19T06:02:35.135Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/authentication.concierge.pinniped.dev/v1alpha1?timeout=32s 200 OK
2022-10-19T06:02:35.136Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/policy/v1beta1?timeout=32s 200 OK
2022-10-19T06:02:35.139Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/apiextensions.k8s.io/v1?timeout=32s 200 OK
2022-10-19T06:02:35.142Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/flowcontrol.apiserver.k8s.io/v1beta2?timeout=32s 200 OK
2022-10-19T06:02:35.145Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/batch/v1beta1?timeout=32s 200 OK
2022-10-19T06:02:35.148Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/autoscaling/v2beta1?timeout=32s 200 OK
2022-10-19T06:02:35.152Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/events.k8s.io/v1?timeout=32s 200 OK
2022-10-19T06:02:35.155Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/discovery.k8s.io/v1beta1?timeout=32s 200 OK
2022-10-19T06:02:35.234Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/notification.toolkit.fluxcd.io/v1beta1?timeout=32s 200 OK
2022-10-19T06:02:35.234Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/events.k8s.io/v1beta1?timeout=32s 200 OK
2022-10-19T06:02:35.252Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/source.toolkit.fluxcd.io/v1beta1?timeout=32s 200 OK
2022-10-19T06:02:35.252Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/source.toolkit.fluxcd.io/v1beta2?timeout=32s 200 OK
2022-10-19T06:02:35.252Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/image.toolkit.fluxcd.io/v1beta1?timeout=32s 200 OK
2022-10-19T06:02:35.253Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/admissionregistration.k8s.io/v1?timeout=32s 200 OK
2022-10-19T06:02:35.257Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/authentication.k8s.io/v1?timeout=32s 200 OK
2022-10-19T06:02:35.335Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/batch/v1?timeout=32s 200 OK
2022-10-19T06:02:35.338Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/kustomize.toolkit.fluxcd.io/v1beta1?timeout=32s 200 OK
2022-10-19T06:02:35.341Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/identity.concierge.pinniped.dev/v1alpha1?timeout=32s 200 OK
2022-10-19T06:02:35.344Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/helm.toolkit.fluxcd.io/v2beta1?timeout=32s 200 OK
2022-10-19T06:02:35.347Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/certificates.k8s.io/v1?timeout=32s 200 OK
2022-10-19T06:02:35.350Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/rbac.authorization.k8s.io/v1?timeout=32s 200 OK
2022-10-19T06:02:35.351Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/coordination.k8s.io/v1?timeout=32s 200 OK
2022-10-19T06:02:35.354Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/discovery.k8s.io/v1?timeout=32s 200 OK
2022-10-19T06:02:35.357Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/apiregistration.k8s.io/v1?timeout=32s 200 OK
2022-10-19T06:02:35.434Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/node.k8s.io/v1beta1?timeout=32s 200 OK
2022-10-19T06:02:35.435Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/flowcontrol.apiserver.k8s.io/v1beta1?timeout=32s 200 OK
2022-10-19T06:02:35.436Z INFO pinniped_proxy::service > GET https://kubernetes.default/apis/image.toolkit.fluxcd.io/v1alpha1?timeout=32s 200 OK
And the winner is the sync_writes
option to the cached macro: I had misunderstood what it does, turns out it actually causes each call to try to acquire a write-lock before executing the function, so the first query gets the write-lock, then every other following query gets I/O blocked at this point while trying to get the credential until the previous one releases it.
Basically forming a queue to sequentially get the credential before each can make the proxied request.
Without sync_writes = true
it works as expected.
And the winner is the sync_writes option to the cached macro
Awesome work Michael. So the write-lock was blocking all subsequent requests.
This is something that still amazes me from programming: such a simple fix, but soo hard to pinpoint 😄
Awesome finding Michael!
Thanks - it was a lot of fun to track down (good chance to learn quite a bit). I just had to share here because I'm working all on my own in my TZ with no-one to high-five with :)
Just an update on this: as mentioned on the linked PR, the issue that I found last week (mentioned above) fixes one aspect, but the requests still takes way too long. I've spent the past couple of days digging to find the issue and finding that
- the call to the actual cached function is sometimes a few ms, other times 100ms:
2022-10-23T22:31:22.999358 [INFO] - prep_and_call took 4ms, 4 of which was call_pinniped
2022-10-23T22:31:23.098464 [INFO] - prep_and_call took 94ms, 94 of which was call_pinniped
2022-10-23T22:31:23.292432 [INFO] - prep_and_call took 96ms, 96 of which was call_pinniped
2022-10-23T22:31:23.484841 [INFO] - prep_and_call took 94ms, 94 of which was call_pinniped
2022-10-23T22:31:23.600418 [INFO] - prep_and_call took 18ms, 18 of which was call_pinniped
2022-10-23T22:31:23.782121 [INFO] - prep_and_call took 93ms, 93 of which was call_pinniped
2022-10-23T22:31:23.792623 [INFO] - prep_and_call took 4ms, 4 of which was call_pinniped
- The actual cache lookup is less than a ms, consistently.
- The longer durations appear to be CPU starvation - but why?
I didn't have much success finding out how the above could be true until this morning, when realising that because the cached function call_pinniped
is called asynchronously, regardless of whether the result is cached or not. So even though the value is cached after the first call, every request going through pinniped gets slept when .await
is called (it's an i/o wait), with the task manager switching to a different task (rather than just pulling the value out of the cache) and so is dependent on the task being enabled again by the task manager.
At least, this is what I will next check, though I need to take a break from this to get some other things done (hence dumping thoughts here). The approach will be, rather than decorating the async function with the cached
macro, instead, having a cache available and only calling the async function on a cache miss. This should avoid sleeping the task unnecessarily.