bottlerocket
Missing runtime metrics from cAdvisor
Platform I'm building on:
EKS
What I expected to happen:
I'd expect to see the cAdvisor runtime values as part of the kubelet metrics.
What actually happened:
The kubelet cAdvisor metrics are missing the runtime values, which means we can't see the node usage.
How to reproduce the problem:
Run kubectl get --raw "/api/v1/nodes/${NODE_NAME}/proxy/stats/summary" | jq -r '.node.systemContainers[] | .name' and you'll see the following response, which has no runtime entry.
kubelet
pods
This is the same as https://github.com/awslabs/amazon-eks-ami/issues/1667; the solution I added there also looks relevant for Bottlerocket.
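For anyone who wants to script the check above across several nodes, here is a minimal sketch. The helper name has_runtime_container is hypothetical, and it assumes the missing system container would be reported under the name runtime once the kubelet is tracking the runtime cgroup.

```shell
#!/bin/sh
# Hypothetical helper: reads system container names (one per line) on stdin
# and succeeds only if an entry named exactly "runtime" is present.
# Assumption: the runtime stats show up under the name "runtime".
has_runtime_container() {
    # -x matches the whole line, -q suppresses output
    grep -qx 'runtime'
}

# Usage sketch (needs cluster access, so it is left commented out):
# kubectl get --raw "/api/v1/nodes/${NODE_NAME}/proxy/stats/summary" \
#   | jq -r '.node.systemContainers[] | .name' \
#   | has_runtime_container \
#   || echo "runtime system container missing on ${NODE_NAME}"
```

With the response shown above (kubelet and pods only), the helper returns a non-zero status, so the message would be printed.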
Hi @stevehipwell, thanks for letting us know! Let me give your solution a try and confirm. Just to keep a record: at some point we moved both containerd.service and kubelet.service under the runtime.slice cgroup. It looks like cAdvisor may need to be told which cgroups to track in order to provide the metrics:
https://github.com/kubernetes/kubernetes/blob/ad6477e342c8ce0f9b1997d5345322c930f6911d/cmd/kubelet/app/server.go#L722
@arnaldo2792 thanks. I've not had a chance to check out the actual code in detail but I'm surprised this doesn't just work. The kubelet should have all of the information required to automate this.
Hi @stevehipwell, thanks for the issue. I had a chance this afternoon to repro it and try your suggested fix, and it worked :D I'm going to test some more, but I'll put up a PR when it's ready.
@arnaldo2792 and @ginglis13 is this related somehow to #2743?
Yes, it is.