Helm Dashboard Slow Performance
Description
Hey,
We recently deployed helm-dashboard in our dev cluster, which has hundreds of namespaces and Helm charts. Everything is fine except the performance of the helm-dashboard UI: it can take minutes to load the chart list or show the manifests of a specific chart.
In the helm-dashboard log we see that simple API requests take tens of seconds or even minutes to complete:
[GIN] 2024/06/10 - 12:47:33 | 200 | 2m28s | 10.162.215.11 | GET "/api/helm/releases/swarm-gateway-stage/swarm-gateway-stage/resources?health=true"
[GIN] 2024/06/10 - 12:47:35 | 200 | 2m29s | 10.162.215.11 | GET "/api/helm/releases/static-site-storage-cleanup-jobs-dev/static-site-storage-cleanup-jobs-dev/resources?health=true"
[GIN] 2024/06/10 - 12:47:37 | 200 | 2m31s | 10.162.215.11 | GET "/api/helm/releases/quick-search-stage/quick-search-stage/resources?health=true"
[GIN] 2024/06/10 - 12:47:38 | 200 | 2m33s | 10.162.215.11 | GET "/api/helm/releases/pdf-composer-dev/pdf-composer-dev/resources?health=true"
[GIN] 2024/06/10 - 12:47:39 | 200 | 37.05µs | 10.162.215.11 | GET "/status"
[GIN] 2024/06/10 - 12:47:39 | 200 | 41.748µs | 10.162.215.11 | GET "/status"
[GIN] 2024/06/10 - 12:47:40 | 200 | 2m34s | 10.162.215.11 | GET "/api/helm/releases/vehicle-brand-model-sync-dev/vehicle-brand-model-sync-dev/resources?health=true"
[GIN] 2024/06/10 - 12:47:42 | 200 | 2m36s | 10.162.215.11 | GET "/api/helm/releases/secret-manager-stage/secret-manager-stage/resources?health=true"
[GIN] 2024/06/10 - 12:47:43 | 200 | 2m38s | 10.162.215.11 | GET "/api/helm/releases/vehicle-brand-model-sync-stage/vehicle-brand-model-sync-stage/resources?health=true"
[GIN] 2024/06/10 - 12:47:45 | 200 | 2m39s | 10.162.215.11 | GET "/api/helm/releases/pricing-engine-job-dev/pricing-engine-job-dev/resources?health=true"
[GIN] 2024/06/10 - 12:47:47 | 200 | 2m41s | 10.162.215.11 | GET "/api/helm/releases/pricing-engine-stage/pricing-engine-stage/resources?health=true"
[GIN] 2024/06/10 - 12:47:48 | 200 | 2m43s | 10.162.215.11 | GET "/api/helm/releases/motor-registry-lv-dev/motor-registry-lv-dev/resources?health=true"
[GIN] 2024/06/10 - 12:47:49 | 200 | 52.039µs | 10.162.215.11 | GET "/status"
[GIN] 2024/06/10 - 12:47:49 | 200 | 34.271µs | 10.162.215.11 | GET "/status"
[GIN] 2024/06/10 - 12:47:50 | 200 | 2m44s | 10.162.215.11 | GET "/api/helm/releases/policies-dev/policies-dev/resources?health=true"
[GIN] 2024/06/10 - 12:47:52 | 200 | 2m46s | 10.162.215.11 | GET "/api/helm/releases/quick-search-v2-dev/quick-search-v2-dev/resources?health=true"
[GIN] 2024/06/10 - 12:47:53 | 200 | 2m48s | 10.162.215.11 | GET "/api/helm/releases/profile-dev/profile-dev/resources?health=true"
[GIN] 2024/06/10 - 12:47:55 | 200 | 2m49s | 10.162.215.11 | GET "/api/helm/releases/pricing-engine-dev/pricing-engine-dev/resources?health=true"
[GIN] 2024/06/10 - 12:47:57 | 200 | 2m51s | 10.162.215.11 | GET "/api/helm/releases/product-packages-dev/product-packages-dev/resources?health=true"
[GIN] 2024/06/10 - 12:47:58 | 200 | 2m53s | 10.162.215.11 | GET "/api/helm/releases/profile-doors-sync-dev/profile-doors-sync-dev/resources?health=true"
[GIN] 2024/06/10 - 12:47:59 | 200 | 45.185µs | 10.162.215.11 | GET "/status"
[GIN] 2024/06/10 - 12:47:59 | 200 | 36.113µs | 10.162.215.11 | GET "/status"
[GIN] 2024/06/10 - 12:48:00 | 200 | 2m54s | 10.162.215.11 | GET "/api/helm/releases/secret-manager-dev/secret-manager-dev/resources?health=true"
[GIN] 2024/06/10 - 12:48:02 | 200 | 2m56s | 10.162.215.11 | GET "/api/helm/releases/swarm-gateway-dev/swarm-gateway-dev/resources?health=true"
[GIN] 2024/06/10 - 12:48:03 | 200 | 2m58s | 10.162.215.11 | GET "/api/helm/releases/saikas-dms-middleware-stage/saikas-dms-middleware-stage/resources?health=true"
[GIN] 2024/06/10 - 12:48:05 | 200 | 2m59s | 10.162.215.11 | GET "/api/helm/releases/profile-sales-sync-dev/profile-sales-sync-dev/resources?health=true"
[GIN] 2024/06/10 - 12:48:07 | 200 | 3m1s | 10.162.215.11 | GET "/api/helm/releases/saikas-proxy-dev/saikas-proxy-dev/resources?health=true"
[GIN] 2024/06/10 - 12:48:08 | 200 | 3m3s | 10.162.215.11 | GET "/api/helm/releases/quick-search-v2-stage/quick-search-v2-stage/resources?health=true"
[GIN] 2024/06/10 - 12:48:09 | 200 | 34.01µs | 10.162.215.11 | GET "/status"
[GIN] 2024/06/10 - 12:48:09 | 200 | 35.801µs | 10.162.215.11 | GET "/status"
[GIN] 2024/06/10 - 12:48:10 | 200 | 3m5s | 10.162.215.11 | GET "/api/helm/releases/profile-stage/profile-stage/resources?health=true"
[GIN] 2024/06/10 - 12:48:12 | 200 | 3m6s | 10.162.215.11 | GET "/api/helm/releases/time-machine-dev/time-machine-dev/resources?health=true"
[GIN] 2024/06/10 - 12:48:14 | 200 | 3m8s | 10.162.215.11 | GET "/api/helm/releases/saikas-proxy-stage/saikas-proxy-stage/resources?health=true"
[GIN] 2024/06/10 - 12:48:15 | 200 | 3m10s | 10.162.215.11 | GET "/api/helm/releases/profile-sales-sync-stage/profile-sales-sync-stage/resources?health=true"
[GIN] 2024/06/10 - 12:48:17 | 200 | 3m11s | 10.162.215.11 | GET "/api/helm/releases/product-packages-stage/product-packages-stage/resources?health=true"
[GIN] 2024/06/10 - 12:48:19 | 200 | 31.666µs | 10.162.215.11 | GET "/status"
[GIN] 2024/06/10 - 12:48:19 | 200 | 37.349µs | 10.162.215.11 | GET "/status"
[GIN] 2024/06/10 - 12:48:19 | 200 | 3m13s | 10.162.215.11 | GET "/api/helm/releases/seb-hh-portfolio-stage/seb-hh-portfolio-stage/resources?health=true"
[GIN] 2024/06/10 - 12:48:20 | 200 | 2m26s | 10.162.215.11 | GET "/api/helm/releases"
[GIN] 2024/06/10 - 12:48:21 | 204 | 54.039µs | 10.162.215.11 | GET "/api/helm/repositories/latestver?name=platform-dotnet-chart"
[GIN] 2024/06/10 - 12:48:21 | 200 | 43.608µs | 10.162.215.11 | GET "/api/helm/repositories/latestver?name=heartbeat"
[GIN] 2024/06/10 - 12:48:21 | 204 | 43.804µs | 10.162.215.11 | GET "/api/helm/repositories/latestver?name=heartbeat-prerequisites"
[GIN] 2024/06/10 - 12:48:21 | 204 | 35.803µs | 10.162.215.11 | GET "/api/helm/repositories/latestver?name=import-map-deployer"
[GIN] 2024/06/10 - 12:48:21 | 200 | 80.3µs | 10.162.215.11 | GET "/static/helm-gray-50.svg"
[GIN] 2024/06/10 - 12:48:23 | 200 | 1.915285787s | 10.162.215.11 | GET "/api/helm/releases/accident-dev/accident-dev/resources?health=true"
[GIN] 2024/06/10 - 12:48:24 | 200 | 3.544809793s | 10.162.215.11 | GET "/api/helm/releases/accident-stage/accident-stage/resources?health=true"
[GIN] 2024/06/10 - 12:48:26 | 200 | 5.245965256s | 10.162.215.11 | GET "/api/helm/releases/activity-tracker-dev/activity-tracker-dev/resources?health=true"
[GIN] 2024/06/10 - 12:48:28 | 200 | 6.914257095s | 10.162.215.11 | GET "/api/helm/releases/activity-tracker-stage/activity-tracker-stage/resources?health=true"
[GIN] 2024/06/10 - 12:48:29 | 200 | 38.601µs | 10.162.215.11 | GET "/status"
[GIN] 2024/06/10 - 12:48:29 | 200 | 40.45µs | 10.162.215.11 | GET "/status"
[GIN] 2024/06/10 - 12:48:29 | 200 | 8.62643479s | 10.162.215.11 | GET "/api/helm/releases/ap-sms-update-dev/ap-sms-update-dev/resources?health=true"
Any suggestions on how to improve the performance? Container compute resource usage looks like this:
which is 10% of the limit.
Screenshots
With many large charts, the requests for status and health take a while. The problem exists, and we need to find a way to fix it.
The charts are actually quite small. In the last screenshot you can see that loading the history URL took almost 3 minutes, which I would not call "a while" :). Glad you are already aware of the issue.
Yes, helm-dashboard performance degrades significantly once I have 50+ charts in my cluster. Most of the time, dashboard requests take approximately 1 minute for the /release and /history API endpoints.
Requests often time out. Such slow performance makes the dashboard useless.
Please note that I have allocated 2 vCPU cores and performance still has not improved.
Hi @undera, do you have any plans to work on performance improvements?
Right now, my main job takes most of my time. I'm open to contributions and collaboration, though.
@harshit-mehtaa and @andriktr, we are open to contributions :) Another option is to use Komodor, where we have those capabilities and much more, designed to scale (hundreds of Helm charts, thousands of clusters).
I recently faced performance issues using helm-dashboard (my own fork). Here's what I did to improve it:
- I modified the frontend of the installed list to load lazily, suspending the resource API call until the release is in view. I introduced a dependency to do this: `import { useInView } from 'react-intersection-observer'`
- I cached some "hot" resources, such as ConfigMaps, Deployments, and ServiceAccounts, using shared index informers. `GetResourceInfo` looks in the cache first. All informers are managed using an LRU cache with a configurable size. I used the `github.com/hashicorp/golang-lru/v2` library for this, which is pretty handy.

The cache part should be designed carefully to prevent frequent initial lists, which may introduce great overhead on the apiserver and the network.
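For readers who want to picture the "look in the cache first" step, here is a minimal sketch. It is not the code from the fork: it caches a plain string per resource instead of managing informers, and it uses a small hand-written LRU built on the standard library as a stand-in for `golang-lru`, so the example stays self-contained. The names `resourceKey`, `lruCache`, `getResourceInfo`, and the `fetch` callback are all hypothetical.

```go
package main

import (
	"container/list"
	"fmt"
	"sync"
)

// resourceKey identifies one cached object (hypothetical shape).
type resourceKey struct {
	Kind, Namespace, Name string
}

type entry struct {
	key   resourceKey
	value string
}

// lruCache is a small fixed-size LRU, standing in for a real LRU library.
type lruCache struct {
	mu    sync.Mutex
	size  int
	order *list.List // front = most recently used
	items map[resourceKey]*list.Element
}

func newLRUCache(size int) *lruCache {
	return &lruCache{size: size, order: list.New(), items: make(map[resourceKey]*list.Element)}
}

// Get returns the cached value and marks the key as recently used.
func (c *lruCache) Get(k resourceKey) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	el, ok := c.items[k]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).value, true
}

// Add stores a value and evicts the least recently used key when full.
func (c *lruCache) Add(k resourceKey, v string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if el, ok := c.items[k]; ok {
		el.Value.(*entry).value = v
		c.order.MoveToFront(el)
		return
	}
	c.items[k] = c.order.PushFront(&entry{key: k, value: v})
	if c.order.Len() > c.size {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

// getResourceInfo checks the cache first and only calls fetch
// (a stand-in for an apiserver request) on a miss.
func getResourceInfo(c *lruCache, k resourceKey, fetch func(resourceKey) (string, error)) (string, error) {
	if v, ok := c.Get(k); ok {
		return v, nil
	}
	v, err := fetch(k)
	if err != nil {
		return "", err
	}
	c.Add(k, v)
	return v, nil
}

func main() {
	calls := 0
	fetch := func(k resourceKey) (string, error) {
		calls++
		return "Healthy", nil
	}
	c := newLRUCache(2)
	k := resourceKey{Kind: "Deployment", Namespace: "profile-dev", Name: "profile-dev"}
	getResourceInfo(c, k, fetch)
	getResourceInfo(c, k, fetch) // served from the cache
	fmt.Println("API calls:", calls)
}
```

The configurable size bounds memory use: whatever has gone unused the longest is dropped first when a new entry does not fit.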
@wylswz Thanks for sharing your findings. Maybe you would be willing to contribute your FE changes from #1 to the project via a PR?
Regarding the cache: it needs to be considered carefully, because of the risk of showing outdated data after it has changed in the cluster.
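One common way to bound that staleness risk is to give each cached entry a maximum age and refetch once it expires. The sketch below illustrates the idea only. `ttlCache`, `defaultMaxAge`, `fakeClock`, and the `fetch` callback are hypothetical names, and the 30-second bound is an arbitrary assumption rather than a helm-dashboard setting. An informer-based cache is instead kept current by watch events, so this applies mainly to simpler request-level caches.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// defaultMaxAge is an assumed staleness bound, chosen only for illustration.
const defaultMaxAge = 30 * time.Second

// fakeClock lets the example advance time without sleeping.
type fakeClock struct{ t time.Time }

func (f *fakeClock) Now() time.Time          { return f.t }
func (f *fakeClock) Advance(d time.Duration) { f.t = f.t.Add(d) }

type cachedStatus struct {
	value     string
	fetchedAt time.Time
}

// ttlCache serves entries only while they are younger than maxAge.
type ttlCache struct {
	mu     sync.Mutex
	maxAge time.Duration
	now    func() time.Time
	items  map[string]cachedStatus
}

func newTTLCache(maxAge time.Duration, now func() time.Time) *ttlCache {
	return &ttlCache{maxAge: maxAge, now: now, items: make(map[string]cachedStatus)}
}

// Get returns (value, servedFromCache, error). An entry that has reached
// maxAge is treated as a miss and refetched, so data is never served
// once it is maxAge old or older.
func (c *ttlCache) Get(key string, fetch func(string) (string, error)) (string, bool, error) {
	c.mu.Lock()
	e, ok := c.items[key]
	c.mu.Unlock()
	if ok && c.now().Sub(e.fetchedAt) < c.maxAge {
		return e.value, true, nil
	}
	// The lock is not held during fetch, so slow requests do not block readers.
	v, err := fetch(key)
	if err != nil {
		return "", false, err
	}
	c.mu.Lock()
	c.items[key] = cachedStatus{value: v, fetchedAt: c.now()}
	c.mu.Unlock()
	return v, false, nil
}

func main() {
	clk := &fakeClock{}
	c := newTTLCache(defaultMaxAge, clk.Now)
	fetch := func(string) (string, error) { return "Healthy", nil }

	_, hit, _ := c.Get("profile-dev", fetch)
	fmt.Println("first call served from cache:", hit)
	_, hit, _ = c.Get("profile-dev", fetch)
	fmt.Println("second call served from cache:", hit)
	clk.Advance(defaultMaxAge)
	_, hit, _ = c.Get("profile-dev", fetch)
	fmt.Println("call after maxAge served from cache:", hit)
}
```

A shorter maximum age shows fresher data at the cost of more apiserver requests, so the value is a trade-off to tune per cluster.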