Update memory_query and cpu_query for Prometheus

Open tjmoyes opened this issue 8 months ago • 2 comments

machine_memory_bytes and machine_cpu_cores have been [deprecated](https://github.com/kubernetes/kube-state-metrics/blob/main/docs/metrics/cluster/node-metrics.md), [as per StackOverflow](https://stackoverflow.com/questions/63901926/how-to-query-the-total-memory-available-to-kubernetes-nodes).

This issue came up for me when querying Azure Managed Prometheus, where these metrics no longer exist. I've tested the new queries on both a manually deployed Prometheus and Azure Managed Prometheus; they work and return the same data.

tjmoyes avatar Mar 25 '25 22:03 tjmoyes

CLA assistant check
All committers have signed the CLA.

CLAassistant avatar Mar 25 '25 22:03 CLAassistant

Walkthrough

The Prometheus queries for cluster summary metrics were updated to retrieve data from kube_node_status_capacity with resource labels instead of machine_memory_bytes and machine_cpu_cores. Data gathering logic and control flow remain unchanged.
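As a rough illustration of the change described above, the new queries might be built along these lines. This is a minimal sketch, not the code from the PR: the helper name `capacity_query` and its `extra_matchers` parameter are hypothetical, and the `sum(...)` aggregation is an assumption about how cluster totals are computed. Only the metric name `kube_node_status_capacity` and its `resource` label values (`cpu`, `memory`) come from the summary.

```python
def capacity_query(resource: str, extra_matchers: str = "") -> str:
    """Build a PromQL expression that sums node capacity for one resource.

    Hypothetical helper for illustration; `extra_matchers` stands in for any
    additional label matchers (for example a cluster selector).
    """
    matchers = f'resource="{resource}"'
    if extra_matchers:
        matchers += f", {extra_matchers}"
    # Doubled braces emit literal PromQL braces around the label matchers.
    return f"sum(kube_node_status_capacity{{{matchers}}})"


# Assumed replacements for the machine_memory_bytes / machine_cpu_cores queries.
memory_query = capacity_query("memory")
cpu_query = capacity_query("cpu")
```

Keeping the resource name as a parameter means both totals share one query template, so a label change only needs to be made in one place.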

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Prometheus metric queries: robusta_krr/core/integrations/prometheus/metrics_service/prometheus_metrics_service.py | Updated get_cluster_summary queries to use kube_node_status_capacity metric with resource labels (cpu, memory) instead of machine_memory_bytes and machine_cpu_cores for computing cluster resource totals |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10–15 minutes

  • Verify that kube_node_status_capacity metric is available in typical Prometheus setups and returns equivalent data
  • Confirm the resource label filtering (cpu, memory) correctly aggregates cluster totals
  • Check for any edge cases where the new metric might behave differently from the previous sources
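One way to approach the equivalence check in the list above is to compare the totals returned by the old and new queries. The sketch below assumes the responses are JSON bodies from the Prometheus HTTP API instant-query endpoint (a `vector` result whose samples carry `[timestamp, "value"]` pairs); the function names `total_from_vector` and `totals_match` and the tolerance are hypothetical choices, not part of krr.

```python
import math


def total_from_vector(response: dict) -> float:
    """Sum all sample values in a Prometheus instant-query response."""
    if response.get("status") != "success":
        raise ValueError("query did not succeed")
    data = response["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector result, got {data['resultType']}")
    # Each sample value is [unix_timestamp, "<number as string>"].
    return math.fsum(float(sample["value"][1]) for sample in data["result"])


def totals_match(old: dict, new: dict, rel_tol: float = 1e-6) -> bool:
    """Return True when both responses add up to (nearly) the same total."""
    return math.isclose(
        total_from_vector(old), total_from_vector(new), rel_tol=rel_tol
    )
```

A metric that does not exist on the server yields an empty vector, which this sketch sums to 0.0, so a mismatch against a non-zero total would flag a setup where one of the metrics is missing.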

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title Check | ✅ Passed | The pull request title "Update memory_query and cpu_query for Prometheus" accurately summarizes the main change in the changeset. According to the raw summary, the changes switch from using deprecated metrics (machine_memory_bytes and machine_cpu_cores) to using kube_node_status_capacity for cluster memory and CPU queries in Prometheus. The title is concise, specific, and clearly indicates what is being updated and for which service, allowing teammates scanning the history to quickly understand the primary purpose of this PR. |
| Description Check | ✅ Passed | The pull request description is directly related to the changeset and provides meaningful context for the changes. It explains that the deprecated metrics (machine_memory_bytes and machine_cpu_cores) caused issues with Azure Managed Prometheus, references relevant documentation, and states that the author tested the new queries on both manually deployed Prometheus and Azure Managed Prometheus to verify they return equivalent data. The description is neither vague nor off-topic; it clearly communicates the motivation and validation for the update. |


coderabbitai[bot] avatar Oct 26 '25 17:10 coderabbitai[bot]