weave-gitops
Improve multi-cluster querying performance
Requests for resources across clusters can take upwards of 9 seconds on our demo clusters, and we have reports of requests taking minutes in customer environments.
The original doc that outlines the work on multi-cluster querying/architecture: https://docs.google.com/document/d/1kxcyznya57gAW28_jUU4ZQVffAyrSze7Py_y0T_z6jE/edit#heading=h.lx604oyg19lq
Some areas for investigation:
- [ ] Cache Kubernetes clients
- [ ] Paginate requests
- [ ] Fail fast when a cluster is unreachable
- [ ] Add tracing
- [ ] Add metrics
- [ ] Ensure namespace-access requests are cached where available
- [ ] Ensure requests that can be made concurrently are concurrent
@jpellizzari @luizbafilho would it be possible to add links to the evidence you have for this epic?
Could you be a bit more specific? It's not clear to me what evidence you want.
Sure, sorry. I mean evidence from any performance tests you ran while analyzing the problem, for example measurements showing what percentage of a request's time was attributable to non-caching clients in a given scenario.
I haven't measured it myself, so I was trying to characterize the time spent by concern.
It just came to mind while reviewing the caching PR 😓
I don't have anything; it came from reading the code itself and checking how k8s clients work.
Ahh, no problem. I was just asking in case you had something I could reuse.
@jpellizzari @enekofb can I close this issue? I'm assuming this work is tracked in a different issue.
Closing in favor of the following
This is being worked on as part of Weave GitOps Enterprise under the following initiative and issue