kepler
Kind returns same value for all Pods in a namespace
Describe the bug All Pods return the same value in a kind cluster.
To Reproduce On macOS, I created a Vagrant VM with VirtualBox as the provider, running Ubuntu 22.04. I made sure that the VM meets all the Kepler requirements (kernel headers). Then I created a kind cluster and bootstrapped Flux on it.
Below are the metrics for the Pods that I wanted to measure. They are the Pods containing the Flux controllers, all in the flux-system namespace. The issue is that they all have the same value, which shouldn't be the case:
curl -G http://localhost:9090/api/v1/query --data-urlencode "query=pod_curr_energy_in_pkg_millijoule{pod_namespace='flux-system'}" | jq
{
"status": "success",
"data": {
"resultType": "vector",
"result": [
{
"metric": {
"__name__": "pod_curr_energy_in_pkg_millijoule",
"command": "helm-contr",
"container": "kepler-exporter",
"endpoint": "http",
"instance": "kind-control-plane",
"job": "kepler-exporter",
"namespace": "kepler",
"pod": "kepler-exporter-7nbzs",
"pod_name": "helm-controller-9b6bb4f68-k2vp7",
"pod_namespace": "flux-system",
"service": "kepler-exporter"
},
"value": [
1664036406.998,
"1"
]
},
{
"metric": {
"__name__": "pod_curr_energy_in_pkg_millijoule",
"command": "kustomize-",
"container": "kepler-exporter",
"endpoint": "http",
"instance": "kind-control-plane",
"job": "kepler-exporter",
"namespace": "kepler",
"pod": "kepler-exporter-7nbzs",
"pod_name": "kustomize-controller-7f4687b878-65q97",
"pod_namespace": "flux-system",
"service": "kepler-exporter"
},
"value": [
1664036406.998,
"1"
]
},
{
"metric": {
"__name__": "pod_curr_energy_in_pkg_millijoule",
"command": "source-con",
"container": "kepler-exporter",
"endpoint": "http",
"instance": "kind-control-plane",
"job": "kepler-exporter",
"namespace": "kepler",
"pod": "kepler-exporter-7nbzs",
"pod_name": "source-controller-f8d655bdc-4fw6n",
"pod_namespace": "flux-system",
"service": "kepler-exporter"
},
"value": [
1664036406.998,
"1"
]
},
{
"metric": {
"__name__": "pod_curr_energy_in_pkg_millijoule",
"command": "tini",
"container": "kepler-exporter",
"endpoint": "http",
"instance": "kind-control-plane",
"job": "kepler-exporter",
"namespace": "kepler",
"pod": "kepler-exporter-7nbzs",
"pod_name": "notification-controller-7f5dbddc94-rtdhq",
"pod_namespace": "flux-system",
"service": "kepler-exporter"
},
"value": [
1664036406.998,
"1"
]
}
]
}
}
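A minimal sketch for checking this programmatically, assuming the JSON response above has already been parsed into a dict (for example with `json.loads`). The helper names `pod_values` and `all_identical` are illustrative only and are not part of Kepler or Prometheus.

```python
from __future__ import annotations


def pod_values(response: dict) -> dict[str, float]:
    """Map each pod_name in a Prometheus instant-query response to its sample value."""
    values = {}
    for series in response["data"]["result"]:
        pod = series["metric"].get("pod_name", "<unknown>")
        # An instant-vector sample is [unix_timestamp, "value as string"].
        _timestamp, value = series["value"]
        values[pod] = float(value)
    return values


def all_identical(values: dict[str, float]) -> bool:
    """True when more than one pod is present and every pod has the same value."""
    return len(values) > 1 and len(set(values.values())) == 1


# Abbreviated copy of the response above (two of the four series).
response = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"pod_name": "helm-controller-9b6bb4f68-k2vp7"},
             "value": [1664036406.998, "1"]},
            {"metric": {"pod_name": "source-controller-f8d655bdc-4fw6n"},
             "value": [1664036406.998, "1"]},
        ],
    },
}
print(all_identical(pod_values(response)))  # True
```

A `True` result over several scrape intervals is a hint that the per-Pod attribution is not using per-Pod activity at all.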
Expected behavior I expected to have more granular data according to the activity of each Pod.
Desktop (please complete the following information):
- OS: macOS with a Vagrant VirtualBox VM (Ubuntu 22.04)
Additional context This may very well be an issue due to running a kind cluster inside of a VM!
@nikimanoledaki thank you for reporting this issue! Are you running on a Mac with an ARM or x86 CPU?
Would you please share the head of the Kepler pod log?
kubectl logs -n kepler daemonset/kepler-exporter |head -100
I am guessing you are seeing an issue similar to mine: https://github.com/sustainable-computing-io/kepler/issues/211#issuecomment-1253068685. It looks like kind on a VM might need more work.
I am not sure how macOS handles kind. On my setup (KVM guest on RHEL), only one pod is reported:
[root@kind-control-plane /]# curl http://10.96.206.34:9102/metrics |grep pod_total_energy_millijoule
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 15976 0 15976 0 0 15.2M 0 --:--:-- --:--:-- --:--:-- 15.2M
# HELP pod_total_energy_millijoule pod_ total energy consumption (millijoule)
# TYPE pod_total_energy_millijoule counter
pod_total_energy_millijoule{command="docker-pro",pod_name="system_processes",pod_namespace="system"} 964
pod_total_energy_millijoule{command="irqbalance",pod_name="kube-scheduler-kind-control-plane",pod_namespace="kube-system"} 4
On Ubuntu it's the same (kind on a KVM VM), and command="cinder-sch" is really weird as well (I guess that's the first pod it encountered):
root@kind-control-plane:/# curl localhost:9102/metrics |grep pod_total_energy_millijoule
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 13445 0 13445 0 0 4376k 0 --:--:-- --:--:-- --:--:-- 4376k
# HELP pod_total_energy_millijoule pod_ total energy consumption (millijoule)
# TYPE pod_total_energy_millijoule counter
pod_total_energy_millijoule{command="cinder-sch",pod_name="system_processes",pod_namespace="system"} 238
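To count how many distinct pods the exporter actually reports, the metrics text can be parsed instead of read by eye. The following is a rough sketch, assuming the output of the `/metrics` endpoint has been saved as a string; `parse_samples` is a hypothetical helper and its regex is deliberately simplified (it does not handle escaped quotes or braces inside label values).

```python
from __future__ import annotations

import re

# Matches lines such as: metric_name{label="value",...} 123
SAMPLE_RE = re.compile(r'^(?P<name>\w+)\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text: str, metric: str) -> list[tuple[dict[str, str], float]]:
    """Return (labels, value) pairs for one metric from Prometheus text output."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comments.
        if not line or line.startswith("#"):
            continue
        match = SAMPLE_RE.match(line)
        # Lines that are not labelled samples (e.g. curl progress) are ignored.
        if match is None or match.group("name") != metric:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels")))
        samples.append((labels, float(match.group("value"))))
    return samples


text = '''# TYPE pod_total_energy_millijoule counter
pod_total_energy_millijoule{command="docker-pro",pod_name="system_processes",pod_namespace="system"} 964
pod_total_energy_millijoule{command="irqbalance",pod_name="kube-scheduler-kind-control-plane",pod_namespace="kube-system"} 4
'''
pods = {labels["pod_name"] for labels, _ in parse_samples(text, "pod_total_energy_millijoule")}
print(len(pods))  # 2
```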
I suspect BM => VM => kind is not a valid use case (at least for now), as neither k8s nor OpenShift has a production env running on this model, per https://github.com/sustainable-computing-io/kepler/issues/211#issuecomment-1252299478, I think.
If we agree it's not a valid use case (production env), maybe we need to document that somewhere.
So, just to double check, you are running inside a VM right?
Yes, BM => VM => kind. Not sure whether @nikimanoledaki is too.
Hey all, sorry for the very late reply. Yes, I was running kind inside a VM:
macOS -> VirtualBox VM running Ubuntu 22.04 -> kind
I ensured that the VM had kernel headers.
Maybe it was not valid because macOS was used as the host. However, this was an attempt to find a working dev env for people who would like to try Kepler on macOS and don't have access to a bare-metal machine.
This could potentially work if the CPU architecture is not discovered and is instead overridden with an env var, as introduced in this PR by @rootfs: https://github.com/sustainable-computing-io/kepler/pull/278
WDYT? I haven't tested it. Does anyone else have macOS and would like to try? I could make time to try in the next few weeks.
@nikimanoledaki this might be related to issue #388, which is about enabling the model server.
What is happening is that a VM normally does not expose power metrics, so Kepler falls back to power estimation for the node energy consumption. Also, since the VM probably does not have hardware counters, Kepler is not calculating the resource usage rate that it would use to determine power consumption per container. So it is probably dividing the host power consumption evenly across all containers. This is why we are seeing the same power consumption for all containers.
To get a better estimate on VMs, you need to use the estimator and/or the model server. See #388.
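The difference between the two attribution modes described above can be illustrated with a small sketch. This is not Kepler's actual code; `split_evenly` and `split_by_usage` are hypothetical names, and the usage numbers stand in for whatever per-container counter (for example CPU cycles) would be available on bare metal.

```python
from __future__ import annotations


def split_evenly(node_energy_mj: float, containers: list[str]) -> dict[str, float]:
    """Fallback: with no per-container usage data, every container gets the same share."""
    if not containers:
        return {}
    share = node_energy_mj / len(containers)
    return {name: share for name in containers}


def split_by_usage(node_energy_mj: float, usage: dict[str, float]) -> dict[str, float]:
    """Attribute node energy in proportion to each container's resource usage."""
    total = sum(usage.values())
    if total == 0:
        # No usable counters (the suspected VM case): fall back to an even split.
        return split_evenly(node_energy_mj, list(usage))
    return {name: node_energy_mj * used / total for name, used in usage.items()}


# Without counters, all usage values are zero and every container reports the same number.
print(split_by_usage(4.0, {"helm": 0.0, "kustomize": 0.0, "source": 0.0, "notification": 0.0}))
# With counters, the shares follow the usage ratio.
print(split_by_usage(100.0, {"helm": 3.0, "source": 1.0}))
```

The first call yields 1.0 for each of the four containers, which matches the shape of the identical values reported in this issue; the second yields 75.0 and 25.0.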
@nikimanoledaki did you manage to collect metrics?
I recently tried creating a similar setup on macOS + VirtualBox, but unfortunately VirtualBox is no longer supported on macOS Ventura, which is the latest version: https://github.com/kubernetes/minikube/issues/15274. Since this setup is not supported at the moment or in the foreseeable future, it should be OK to close this issue. I have not managed to make Kepler work in a VM on macOS in any other way, so I use a Linux machine instead when I have to run Kepler, and I am not blocked.