Offline/unplugged CPUs are showing in container metrics when using `cgroup1`
In a cgroup1 VM with a single CPU (the implied default, limits.cpu=1), the metrics for its guest instances apparently include the other CPU cores that are "hotpluggable" in the VM:
sdeziel@sdeziel-lemur:~$ nproc
12
$ lxc exec v1 -- nproc
1
root@v1:~# nproc
1
root@v1:~# lxc query /1.0/metrics | grep ^lxd_cpu_seconds
lxd_cpu_seconds_total{cpu="0",mode="system",name="a1",project="default",type="container"} 0.150751482
lxd_cpu_seconds_total{cpu="0",mode="user",name="a1",project="default",type="container"} 0.043823309
lxd_cpu_seconds_total{cpu="2",mode="system",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="2",mode="user",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="3",mode="system",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="3",mode="user",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="4",mode="system",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="4",mode="user",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="7",mode="system",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="7",mode="user",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="10",mode="system",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="10",mode="user",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="1",mode="system",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="1",mode="user",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="5",mode="system",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="5",mode="user",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="6",mode="system",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="6",mode="user",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="8",mode="system",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="8",mode="user",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="9",mode="system",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="9",mode="user",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="11",mode="system",name="a1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="11",mode="user",name="a1",project="default",type="container"} 0
Here's how to reproduce:
lxc launch ubuntu-daily:22.04 --vm v1
lxc exec v1 -- sed -i 's/console=ttyS0"/console=ttyS0 systemd.unified_cgroup_hierarchy=0"/' /etc/default/grub.d/50-cloudimg-settings.cfg
lxc exec v1 -- update-grub
lxc restart v1
lxc exec v1 -- lxd init --auto
lxc exec v1 -- lxc launch ubuntu-minimal:22.04 c1
lxc exec v1 -- lxc query /1.0/metrics | grep ^lxd_cpu_seconds
The metrics query should only report on cpu="0", but it also reports a value of 0 for the other CPU cores, which are not online/plugged:
$ lxc exec v1 -- lxc query /1.0/metrics | grep ^lxd_cpu_seconds
lxd_cpu_seconds_total{cpu="7",mode="system",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="7",mode="user",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="8",mode="system",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="8",mode="user",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="9",mode="system",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="9",mode="user",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="11",mode="system",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="11",mode="user",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="2",mode="system",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="2",mode="user",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="4",mode="system",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="4",mode="user",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="6",mode="system",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="6",mode="user",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="5",mode="system",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="5",mode="user",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="10",mode="system",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="10",mode="user",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="0",mode="system",name="c1",project="default",type="container"} 1.510519089
lxd_cpu_seconds_total{cpu="0",mode="user",name="c1",project="default",type="container"} 3.571449891
lxd_cpu_seconds_total{cpu="1",mode="system",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="1",mode="user",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="3",mode="system",name="c1",project="default",type="container"} 0
lxd_cpu_seconds_total{cpu="3",mode="user",name="c1",project="default",type="container"} 0
$ nproc
12
$ lxc exec v1 -- nproc
1
$ lxc exec v1 -- snap list lxd
Name  Version        Rev    Tracking      Publisher   Notes
lxd   5.0.3-babaaf8  27948  5.0/stable/…  canonical✓  -
FYI, this is reproducible with 5.0/stable, 5.21/stable and latest/edge.
@mihalicyn is this expected?
@simondeziel how is the behavior different in cgroupv2?
With cgroup2 (the default on 22.04, maybe 20.04 too?), only cpu="0" is reported, which seems to be expected given https://github.com/canonical/lxd/blob/main/lxd/cgroup/abstraction.go#L340-L341, and cpu="0" is always online.
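To double-check which of the reported cpu labels belong to offline CPUs, the labels can be compared with the kernel's online CPU list inside the VM. This is only a rough sketch: the helper names `expand_cpu_list` and `offline_cpu_metrics` are made up for illustration, and it assumes the VM exposes the online list in `/sys/devices/system/cpu/online` in the usual range format (e.g. `0` or `0-2,5`).

```shell
# Hypothetical helpers, not part of LXD.

# Expand a kernel CPU list such as "0-2,5" into one CPU number per line.
expand_cpu_list() {
    local list="$1" part start end
    local IFS=','
    for part in $list; do
        [ -n "$part" ] || continue
        start="${part%-*}"   # "0-2" -> "0", "5" -> "5"
        end="${part#*-}"     # "0-2" -> "2", "5" -> "5"
        seq "$start" "$end"
    done
}

# Read metric lines on stdin and print those whose cpu="N" label
# is NOT in the online CPU list passed as the first argument.
offline_cpu_metrics() {
    local online
    online="$(expand_cpu_list "$1" | tr '\n' ' ')"
    awk -v online="$online" '
        BEGIN {
            n = split(online, ids, " ")
            for (i = 1; i <= n; i++) on[ids[i]] = 1
        }
        match($0, /cpu="[0-9]+"/) {
            # Skip the 5-character prefix cpu=" and the closing quote.
            id = substr($0, RSTART + 5, RLENGTH - 6)
            if (!(id in on)) print
        }'
}

# Example usage inside the VM (assumes lxc is available there):
#   lxc query /1.0/metrics | grep ^lxd_cpu_seconds \
#     | offline_cpu_metrics "$(cat /sys/devices/system/cpu/online)"
```

Any line this prints is a metric for a CPU that the VM's kernel does not consider online.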
@simondeziel @mihalicyn please can you chat about this and figure out if we need to do anything here?
https://lore.kernel.org/all/[email protected]