kepler
Kepler (Kubernetes-based Efficient Power Level Exporter) uses eBPF to probe performance counters and other system stats, uses ML models to estimate workload energy consumption based on these stats, and...
### What happened? When deploying the Kepler operator (default instance), after some time the Thanos ruler takes too long to execute the rules and starts raising this alert: Thanos Rule...
### What would you like to be added? Add a new step that uses an SBOM tool such as https://github.com/devops-kung-fu/bomber/tree/main to read our SBOM file and fix CVEs. btw, if anyone...
Currently `pkg/bpfassets` contains many global variables that are accessed from many other packages. This makes it hard for both the `bpfAssets` package and other packages that depend on it to...
This fixes a few issues identified with the bpf code format and refactor. 1. Zero initialize all variables 2. Use the bpf_perf_event_read_value helper exclusively 3. Cherry-pick of PR #1410 -...
### What would you like to be added? There is currently a single BCC unit test which appears to be skipped on CI. We should have a test that: 1....
### What happened? When Kepler, using `latest`, is deployed against OpenShift 4.15 (VM/BM), it does not produce any metric values. All the metric values are 0. Some screenshots for reference:...
### What happened? With [this patch](https://github.com/dave-tucker/kepler/commit/4af706a7b1303227251a13bdef7ae545f9787113) applied the bpf probes are unable to be loaded due to a verifier error. There is a `TODO` in the code to track fixing...
WIP: Updated prometheus to compare metrics based on timestamps. Currently investigating inconsistencies in Prometheus timestamp datapoints between the requests API and the Prometheus API client functions. Signed-off-by: Kaiyi
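A minimal sketch of what timestamp-based comparison could look like, assuming each metric series is a list of `(timestamp_seconds, value)` pairs sorted by time. The function names, the `tolerance` parameter, and the sample data are hypothetical and are not taken from the PR; the sketch only illustrates pairing samples by nearest timestamp before comparing values.

```python
from bisect import bisect_left


def align_by_timestamp(expected, actual, tolerance=1.0):
    """Pair each sample in `expected` with the nearest-in-time sample in `actual`.

    Both inputs are lists of (timestamp_seconds, value) sorted by timestamp.
    Samples with no counterpart within `tolerance` seconds are skipped.
    (Hypothetical helper, not part of Kepler.)
    """
    actual_ts = [ts for ts, _ in actual]
    pairs = []
    for ts, value in expected:
        i = bisect_left(actual_ts, ts)
        # Only the neighbours just before and just after the insertion
        # point can be the nearest sample.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(actual)]
        if not candidates:
            continue
        j = min(candidates, key=lambda k: abs(actual_ts[k] - ts))
        if abs(actual_ts[j] - ts) <= tolerance:
            pairs.append((ts, value, actual[j][1]))
    return pairs


def mean_absolute_error(pairs):
    """Average absolute difference over the aligned (ts, a, b) triples."""
    if not pairs:
        raise ValueError("no overlapping samples to compare")
    return sum(abs(a - b) for _, a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    expected = [(0.0, 10.0), (3.0, 12.0), (6.0, 14.0)]
    actual = [(0.2, 11.0), (3.1, 12.0), (20.0, 99.0)]
    aligned = align_by_timestamp(expected, actual)
    print(aligned)
    print(mean_absolute_error(aligned))
```

Skipping samples that have no counterpart within the tolerance keeps misaligned datapoints from inflating the error, which is one plausible way to handle the kind of timestamp inconsistencies mentioned above.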
Tested on a setup with Habana

```
kepler_node_gpu_joules_total{instance="",mode="dynamic",package="0",source="habana"} 206.895
kepler_node_gpu_joules_total{instance="",mode="dynamic",package="1",source="habana"} 345.609
kepler_node_gpu_joules_total{instance="",mode="dynamic",package="2",source="habana"} 128.661
kepler_node_gpu_joules_total{instance="",mode="dynamic",package="3",source="habana"} 362.46
kepler_node_gpu_joules_total{instance="",mode="dynamic",package="4",source="habana"} 157.596
kepler_node_gpu_joules_total{instance="",mode="dynamic",package="5",source="habana"} 327.219
kepler_node_gpu_joules_total{instance="",mode="dynamic",package="6",source="habana"} 299.94
kepler_node_gpu_joules_total{instance="",mode="idle",package="0",source="habana"} 1314.276
kepler_node_gpu_joules_total{instance="",mode="idle",package="1",source="habana"} 921.663
kepler_node_gpu_joules_total{instance="",mode="idle",package="2",source="habana"} 1363.764
kepler_node_gpu_joules_total{instance="",mode="idle",package="3",source="habana"} 988.407
kepler_node_gpu_joules_total{instance="",mode="idle",package="4",source="habana"}...
```
This removes the BCC code from the repository and from the bpfassets package. The bpfassets package has been simplified given we will only support a single "attacher". Fixes: #1390