krr
krr copied to clipboard
Prometheus workload loader
- [x] Implemented ability to select an alternative workload loader
- [x] Implemented Prometheus workload loader
- [x] Refactored Kube API workload loader: moved separate logic (for each Kind) to a separate class
- [x] List clusters on centralized prometheus
- [x] Scanning multiple clusters in prometheus mode?
- [ ] Test running from robusta UI
- [ ] Test prometheus mode on centralized prometheus
- [x] A lot of testing
- [x] Fix tests (mocks are now broken)
Testing issues found:
- [x] For some reason auto-discovery does not work
- [x] Prometheus mode does not gather HPA data
- [x] Something is now broken with HPAKey
When testing using the prometheus mode on a centralized Prometheus (VictoriaMetrics), I encountered the following errors.
I specified the labels --prometheus-label
and --prometheus-cluster-label
according to the docs:
python krr.py simple -p https://vmauth.xxx.com --prometheus-auth-header "<redacted>" --prometheus-label cluster -l k8s0-q --mode prometheus
To me it seems that the query includes the wrong label here, and should be avg by(cluster)
instead of using the cluster name.
Hi @deutschj, You're correct of course! I've pushed a fix. Can you please test?
@deutschj please note that I changed the name of the flags a little -- see --help
if necessary.
@aantn Sure, thanks a lot for working on the bugfix!
Whett testing with the options --prometheus-cluster-key
and -l
I encountered the following error:
krr git:(prometheus-workload-loader) python krr.py simple --prometheus-url https://vmauth.xxx.com --prometheus-auth-header "<redacted>" --mode prometheus --prometheus-cluster-key cluster -l k8s0-q
Running Robusta's KRR (Kubernetes Resource Recommender) 1.8.2-dev
Using strategy: Simple
Using formatter: table
A newer version of KRR is available: v1.11.0
[08:22:56] INFO Connecting using Prometheus, will load the kubeconfig.
INFO Using Prometheus at https://vmauth.xxx.com
INFO Prometheus found
INFO Prometheus connected successfully
[08:22:57] INFO Clusters available: k8s0-q, <cluster1>, <cluster2>, [...]
CRITICAL Cannot scan multiple clusters for this prometheus, Rerun with the flag `-c <cluster>` where <cluster> is one of ['k8s0-q', '<cluster1>', '<cluster2>', [...]]
However, when providing only the -l
option without the --prometheus-cluster-key
, KRR starts to generate recommendations for the selected cluster. When generating, the following warnings are displayed for every existing workload:
WARNING Prometheus returned no PercentileCPULoader metrics for StatefulSet xxx
WARNING Prometheus returned no MaxMemoryLoader metrics for StatefulSet xxx
WARNING Prometheus returned no CPUAmountLoader metrics for StatefulSet xxx
WARNING Prometheus returned no MemoryAmountLoader metrics for StatefulSet xxx
INFO Calculated recommendations for StatefulSet xxx (using 4 metrics)
And the table with the generated resource recommendations in the end is empty though. So looks like now getting the existing workloads in the cluster works correctly, but getting the corresponding metrics from the centralized Prometheus doesn't yet.
@deutschj does it work if you provide --prometheus-cluster-key
, -l
, and -c
with -c
set to the same value as -l
?
@aantn Unfortunately not, no - this yields the same error as above, CRITICAL Cannot scan multiple clusters for this prometheus, Rerun with the flag '-c <cluster>' where <cluster> is one of [...]
Would love to see this merged, using KRR from this feature branch for some time now
Hey, we're planning to get it merged, but no exact ETA yet.
hello. Are there any plans for this branch to be merged?
hey @wad-hongsumin
We do plan to merge it, hopefully soon
@arikalon1 thank you. I'm hoping this PR gets merged quickly.