Cannot get pod energy information
Hi, I'm trying to use Kepler in my k8s cluster. It was deployed on one node (node1) together with Prometheus and Grafana. There are many pods running on this node. I was expecting energy for all of these pods to be displayed in the Grafana dashboard, however I can only see one record of "pod_energy_stat" in Prometheus, with pod_name="system_processes" and pod_namespace="system", and this pod/namespace doesn't even exist in my cluster. Do you have any clue what the issue is about?
Thank you for reporting this issue. From your description, it looks like Kepler failed to resolve the pods and attributed all energy usage to system processes (i.e. using system_processes as the pod name and system as the pod namespace). When Kepler cannot find the pods, the root cause is likely the container runtime or the kubelet port in your cluster.
Can you check the following and share your cluster info?
- do you use cgroup v1 or v2? v2 is the assumed default. If you use v1, please turn this option on
- what is the container runtime? We've tested cri-o. If your container runtime is different, there might be a different sysfs path for the pod, which would make Kepler unable to resolve the pod name
- which port does kubelet use? Kepler assumes the default port 10250. If your kubelet runs on a different port, Kepler may not be able to reach it; the non-default port has to be set in the env var
- do you have the Kepler log available, i.e. the output of the following?
kubectl logs -n monitoring daemonset/kepler-exporter
cc @marceloamaral
Thanks for the prompt response. Please see my answers inline.
- do you use cgroup v1 or v2? v2 is the assumed default.
- I think my kernel supports both v1 and v2:
$ mount | grep '^cgroup' | awk '{print $1}' | uniq
cgroup2
cgroup
- what is the container runtime?
- It's containerd://1.5.9-0ubuntu3
- which port does kubelet use?
- It should be the default port.
- do you have the kepler log available?
- The log is like the following, all about "system_processes" in namespace "system":
2022/08/16 03:20:48 energy count: core 126086.00 dram: 6008.00 time 0.000000 cycles 28257170458 instructions 30679042784 misses 33231048 node memory 7492431872.000
2022/08/16 03:20:48 energy from pod: name: system_processes namespace: system eCore: 183834(48858366737) eDram: 132(9223372036862737727) eOther: 0(0) eGPU: 0(0) CPUTime: 0.00 (NaN) cycles: 28257170458 (1.0000) instructions: 30679042784 (1.0000) DiskReadBytes: 0 (0) DiskWriteBytes: 0 (0) misses: 33231048 (1.0000) ResidentMemRatio: 0.1929 avgCPUFreq: 1707.8185 MHZ pid: 1002212 comm: containerd-shim cgroupfs: map[]
Thanks, I'll check a setup using containerd
Turns out containerd on my setup has a different path pattern:
# kubectl describe pod nginx-7cd588b686-mkpzs |grep "Container ID"
Container ID: containerd://286b15051ec43375190802e1f40562536980a8fd97e75bb89c7f2eec6f995f17
# find /sys/fs/cgroup/systemd/ -iname "*286b15051ec43375190802e1f40562536980a8fd97e75bb89c7f2eec6f995f17"
/sys/fs/cgroup/systemd/system.slice/containerd.service/kubepods-burstable-poda3b200c9_db51_40b4_9d2d_53f8fdf80d7f.slice:cri-containerd:286b15051ec43375190802e1f40562536980a8fd97e75bb89c7f2eec6f995f17
while the regex used to parse the container path doesn't capture this pattern
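For illustration only, here is a minimal Go sketch (not Kepler's actual regex; the runtime prefixes are assumptions) of a pattern that would capture the 64-character hex container ID from both scope-style paths and the cri-containerd:<id> suffix shown above:

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical pattern: match a known runtime prefix followed by a
// 64-character hex container ID, covering the "cri-containerd:<id>" form
// seen in the cgroup path above as well as "docker-<id>" / "crio-<id>".
var containerIDRe = regexp.MustCompile(`(?:docker-|crio-|cri-containerd[-:])([0-9a-f]{64})`)

func main() {
	path := "/sys/fs/cgroup/systemd/system.slice/containerd.service/" +
		"kubepods-burstable-poda3b200c9_db51_40b4_9d2d_53f8fdf80d7f.slice:" +
		"cri-containerd:286b15051ec43375190802e1f40562536980a8fd97e75bb89c7f2eec6f995f17"
	if m := containerIDRe.FindStringSubmatch(path); m != nil {
		fmt.Println("container ID:", m[1])
	}
}
```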
cc @marceloamaral
this was tested on Ubuntu 20.04 with containerd version
# containerd -v
containerd github.com/containerd/containerd 1.5.9-0ubuntu1~20.04.4
@ruomengh would you like to provide a fix for this? We can go through the development process to help you get started.
Sounds good. I'd like to give it a try but I may need some guidance.
I've emailed the development process; please let me know if you have any issues there. Looking forward to your contribution!
@ruomengh can you try the latest kepler container image? Deleting the kepler deployment and recreating it will do.
I just hit the same issue on RHEL 8 with cri-o, but it seems #108 fixed it
cc @sunya-ch
Re-deployed Kepler with the latest image and the issue remains.
@ruomengh What is the detected kernel version? Could you post the head part of the Kepler log?
> kubectl logs -n $(kubectl get po -A|grep kepler-exporter|awk '{print $1,$2}')|head
2022/08/18 01:50:57 InitSliceHandler: &{map[] /sys/fs/cgroup/system.slice /sys/fs/cgroup/system.slice /sys/fs/cgroup/system.slice}
use sysfs to obtain power
config EnabledEBPFCgroupID enabled: true
config getKernelVersion: 4.18
config set EnabledEBPFCgroupID to true
2022/08/18 01:33:03 InitSliceHandler: &{map[] /sys/fs/cgroup/cpu /sys/fs/cgroup/memory /sys/fs/cgroup/blkio}
use sysfs to obtain power
config EnabledEBPFCgroupID enabled: true
config getKernelVersion: 5.15
config set EnabledEBPFCgroupID to false
According to this log, the only condition that causes EnabledEBPFCgroupID to be disabled is the cgroup version check.
The current implementation checks whether this static path exists: /sys/fs/cgroup/cgroup.controllers.
Could you confirm that this path exists on your host?
If it exists on the host, double check that the Kepler deployment manifest mounts the /sys path (this should be defined in the provided manifest). If it does not exist, I think we need to fix the method that detects the cgroup version to cover this.
related source code: https://github.com/sustainable-computing-io/kepler/blob/ef763b68e9a11e956936de06b8a4e8af94458f58/pkg/config/config.go#L69
The path doesn't exist - ls: cannot access '/sys/fs/cgroup/cgroup.controllers': No such file or directory
I see. It seems your system does not have cgroup v2 set up that way, so the file isn't created where we expect it, as in https://github.com/sustainable-computing-io/kepler/issues/29.
I think we should change the cgroup v2 detection to use the mount point on the host, as you did, or to check the /proc/filesystems file.
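To make that proposal concrete, here is a rough Go sketch of the alternative detection (assumed logic, not Kepler's current code): first look for the static /sys/fs/cgroup/cgroup.controllers file, and if it is absent fall back to scanning the host's mount table for a cgroup2 entry:

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// isCgroupV2 reports whether a cgroup v2 (unified) hierarchy is available.
// It assumes the host's /sys and /proc are visible to the process.
func isCgroupV2() bool {
	// Fast path: this file only exists when the unified hierarchy is mounted at /sys/fs/cgroup.
	if _, err := os.Stat("/sys/fs/cgroup/cgroup.controllers"); err == nil {
		return true
	}
	// Fallback: look for a cgroup2 mount entry, similar to `mount | grep '^cgroup'`.
	f, err := os.Open("/proc/mounts")
	if err != nil {
		return false
	}
	defer f.Close()
	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		// /proc/mounts format: device mountpoint fstype options dump pass
		fields := strings.Fields(scanner.Text())
		if len(fields) >= 3 && fields[2] == "cgroup2" {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println("cgroup v2 detected:", isCgroupV2())
}
```

Whether a hybrid setup (both cgroup and cgroup2 mounted, as in the mount output earlier in this thread) should count as v2 is exactly the judgement call being discussed here.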
It seems cgroup v2 is disabled on the node to avoid another issue. I tried the cgroup v1 option @rootfs suggested (https://github.com/sustainable-computing-io/kepler/blob/main/manifests/kubernetes/deployment.yaml#L71) but it doesn't work either.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Since we support cgroup v1 now, @ruomengh do we still have this issue?
@jiere I assume the issue is gone in our environment. Please help confirm. Thanks.
Using quay.io/sustainable_computing_io/kepler:release-0.5 on Fedora Core 33 with cgroup v1 (systemd.unified_cgroup_hierarchy=0 in grub) does not work as expected.
See below an extract of the logs (verbose=5), where every container has the same counters:
I0623 15:45:09.166734 1 metric_collector.go:137] energy from pod/container (0 active processes): name: csi-cinder-nodeplugin-4jx6c/cinder-csi-plugin namespace: kube-system
cgrouppid: 0 pid: [] comm:
Dyn ePkg (mJ): 9512 (4461128) (eCore: 9512 (4461128) eDram: 150 (70350) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0)
Idle ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0)
CPUTime: 0 (0)
NetTX IRQ: 0 (0)
NetRX IRQ: 0 (0)
Block IRQ: 0 (0)
counters: map[cache_miss:0 (0) cpu_cycles:0 (0) cpu_instr:0 (0) cpu_ref_cycles:0 (0)]
cgroupfs: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0)]
kubelets: map[container_cpu_usage_seconds_total:0 (1503) container_memory_working_set_bytes:0 (13774848)]
I0623 15:45:09.166748 1 metric_collector.go:137] energy from pod/container (0 active processes): name: csi-cinder-nodeplugin-4jx6c/node-driver-registrar namespace: kube-system
cgrouppid: 0 pid: [] comm:
Dyn ePkg (mJ): 9512 (4461128) (eCore: 9512 (4461128) eDram: 150 (70350) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0)
Idle ePkg (mJ): 0 (0) (eCore: 0 (0) eDram: 0 (0) eUncore: 0 (0)) eGPU (mJ): 0 (0) eOther (mJ): 0 (0)
CPUTime: 0 (0)
NetTX IRQ: 0 (0)
NetRX IRQ: 0 (0)
Block IRQ: 0 (0)
counters: map[cache_miss:0 (0) cpu_cycles:0 (0) cpu_instr:0 (0) cpu_ref_cycles:0 (0)]
cgroupfs: map[block_devices_used:0 (0) cgroupfs_cpu_usage_us:0 (0) cgroupfs_ioread_bytes:0 (0) cgroupfs_iowrite_bytes:0 (0) cgroupfs_kernel_memory_usage_bytes:0 (0) cgroupfs_memory_usage_bytes:0 (0) cgroupfs_system_cpu_usage_us:0 (0) cgroupfs_tcp_memory_usage_bytes:0 (0) cgroupfs_user_cpu_usage_us:0 (0)]
kubelets: map[container_cpu_usage_seconds_total:0 (13) container_memory_working_set_bytes:0 (3063808)]
@marceloamaral
@rootfs the problem is the BPF code.
With the apiserver we can now identify the containers, but the apiserver does not give us the PIDs inside the containers. You can see in the logs that all the PID lists are empty. The cgroup metrics can only be extracted if we have the PID of a process in the cgroup.
The PID information comes from the eBPF code, which is probably not working.
What is this environment? Is it bare-metal or a VM?
How was k8s deployed? Mini-clusters like kind and minikube do not expose the host /proc folder by default.
So, when we create a kind cluster, we mount the host folder so that Kepler can access the PID information.
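As a small illustration of why the host /proc matters (this is not Kepler's code, and the paths are assumptions): mapping a PID reported by eBPF back to a container means reading /proc/<pid>/cgroup on the host, which only works if that filesystem is mounted into the exporter pod.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// cgroupOfPID reads /proc/<pid>/cgroup, which holds the cgroup path the
// container runtime placed the process in; this is the link between a PID
// seen by eBPF and a concrete container.
func cgroupOfPID(procRoot string, pid int) (string, error) {
	data, err := os.ReadFile(procRoot + "/" + strconv.Itoa(pid) + "/cgroup")
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	// procRoot is the host's /proc as seen inside the pod; if the host /proc
	// is not exposed (the default in kind/minikube), this lookup fails.
	out, err := cgroupOfPID("/proc", os.Getpid())
	if err != nil {
		fmt.Println("cannot read cgroup info:", err)
		return
	}
	fmt.Print(out)
}
```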
This Kubernetes environment has been deployed using Openstack Magnum on VMs.
@fadam-csgroup
Can you check if you have visibility of the PIDs?
The command oc exec -it -n kepler service/kepler-exporter -- ls /proc
should show a lot of PIDs.
@marceloamaral
Yes, the command gives me lots of PIDs:
$ kubectl -n monitoring exec -it ds/kepler -- ls /proc -1 | grep -E '^[0-9]+$' | wc -l
553
@fadam-csgroup can you please share the Kepler logs? Put them in https://pastebin.com/
@marceloamaral Here are the logs: https://pastebin.com/2i7RX0Y8
I0629 12:29:27.877410 1 process_metric.go:147] cannot extract: container_cpu_usage_seconds_total
I0629 12:29:27.877413 1 process_metric.go:147] cannot extract: container_memory_working_set_bytes
I0629 12:29:27.877418 1 process_metric.go:147] cannot extract: cgroupfs_memory_usage_bytes
I0629 12:29:27.877421 1 process_metric.go:147] cannot extract: cgroupfs_kernel_memory_usage_bytes
I0629 12:29:27.877424 1 process_metric.go:147] cannot extract: cgroupfs_tcp_memory_usage_bytes
I0629 12:29:27.877427 1 process_metric.go:147] cannot extract: cgroupfs_cpu_usage_us
I0629 12:29:27.877431 1 process_metric.go:147] cannot extract: cgroupfs_system_cpu_usage_us
I0629 12:29:27.877434 1 process_metric.go:147] cannot extract: cgroupfs_user_cpu_usage_us
this is probably the same issue as https://github.com/sustainable-computing-io/kepler/discussions/750#discussioncomment-6265641
@mcalman
@fadam-csgroup on your setup, are CPU and memory accounting turned on? The following is from my setup, where CPU and memory accounting are turned on for kubelet:
# sudo systemctl show kubelet |grep -i accounting
CPUAccounting=yes
IOAccounting=no
BlockIOAccounting=yes
MemoryAccounting=yes
TasksAccounting=yes
IPAccounting=no
@rootfs
It seems that memory accounting is enabled but CPU accounting is not:
# systemctl show kubelet |grep -i accounting
CPUAccounting=no
IOAccounting=no
BlockIOAccounting=no
MemoryAccounting=yes
TasksAccounting=yes
IPAccounting=no