Kubelite not starting after power failure unless cgroups-per-qos=false

Open AlexGustafsson opened this issue 1 year ago • 2 comments

Summary

After the host had been shut down abruptly, microk8s (kubelite) would no longer start due to the following error:

Jan 24 19:02:35 bernd microk8s.daemon-kubelite[2373]: E0124 19:02:35.011772    2373 kubelet.go:1542] "Failed to start ContainerManager" err="failed to initialize top level QOS containers: root container [kubepods] doesn't exist"

After having applied the workaround mentioned by @neoaggelos in https://github.com/canonical/microk8s/issues/4301#issuecomment-1810061954, microk8s started.

Now microk8s cannot start without those changes.
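
For reference, a minimal sketch of how such a kubelet flag could be added. The linked comment is not reproduced here, so the exact flags are an assumption based on the issue title. The file path is the usual snap location of the kubelet arguments and may differ on other setups. `ensure_arg` is a hypothetical helper, not part of MicroK8s. As far as I know, kubelet also requires `--enforce-node-allocatable` to be empty when `--cgroups-per-qos=false` is set, so the usage below adds both.

```shell
# Hypothetical helper (not shipped with MicroK8s): append an argument line to
# an args file unless that exact line is already present.
# Assumes the file ends with a newline, as the snap-managed args files normally do.
ensure_arg() {
  file="$1"
  arg="$2"
  # -x matches whole lines, -F treats the argument as a fixed string,
  # and "--" stops grep from parsing the flag as one of its own options.
  grep -qxF -- "$arg" "$file" || printf '%s\n' "$arg" >> "$file"
}

# Usage (assumed path and flags; run as root, then restart kubelite):
#   ensure_arg /var/snap/microk8s/current/args/kubelet '--cgroups-per-qos=false'
#   ensure_arg /var/snap/microk8s/current/args/kubelet '--enforce-node-allocatable='
#   snap restart microk8s.daemon-kubelite
```

The helper only checks for an identical line, so a pre-existing `--cgroups-per-qos=true` entry would have to be edited by hand. Removing the added lines and restarting kubelite reverts the change.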

What Should Happen Instead?

Microk8s should start without having to disable cgroups per qos.

Reproduction Steps

None.

Introspection Report

inspection-report-20240124_193532.tar.gz

AlexGustafsson avatar Jan 24 '24 19:01 AlexGustafsson

Before the power outage, the host had been running for a long time without a reboot. During that uptime, microk8s had been updated from 1.26 through 1.27 and 1.28 to 1.29. So the power cycle might just have exposed issues that any reboot would have shown.

I haven't found anything in the patch notes that suggests a recent change in how cgroups work. The computer hasn't been configured any differently since it was last working. So I'm unsure what would make cgroups misbehave (as suggested in #4301).

AlexGustafsson avatar Jan 24 '24 19:01 AlexGustafsson

Hi @AlexGustafsson, thank you for raising this. This is an issue we have been seeing with MicroK8s 1.29 recently, see also #4361. I wonder if you are bumping into the same problem.

neoaggelos avatar Jan 25 '24 09:01 neoaggelos