Microk8s v1.29 snap installation failed on plain Debian 12.4
Summary
Over the last few days I noticed that the installation of MicroK8s v1.29/stable (6364) fails on a fresh (plain) Debian 12.4 system (tested on AWS EC2 with the default Debian 12 image provided by AWS). After a few tests I can summarize the following behavior:
- Installation of MicroK8s v1.28/stable (6089) on the described Debian system via snap works as expected: microk8s-1.28_6089-inspection-report-20240110_103728.tar.gz
- Installation of MicroK8s v1.29/stable (6364) on the described Debian system via snap fails, and microk8s inspect responds with:
admin@ip-172-31-16-112:~$ microk8s status
microk8s is not running. Use microk8s inspect for a deeper inspection.
admin@ip-172-31-16-112:~$ microk8s inspect
Inspecting system
Inspecting Certificates
Inspecting services
Service snap.microk8s.daemon-cluster-agent is running
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-kubelite is running
Service snap.microk8s.daemon-k8s-dqlite is running
Service snap.microk8s.daemon-apiserver-kicker is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
Copy processes list to the final report tarball
Copy disk usage information to the final report tarball
Copy memory usage information to the final report tarball
Copy server uptime to the final report tarball
Copy openSSL information to the final report tarball
Copy snap list to the final report tarball
Copy VM name (or none) to the final report tarball
Copy current linux distribution to the final report tarball
Copy asnycio usage and limits to the final report tarball
Copy inotify max_user_instances and max_user_watches to the final report tarball
Copy network configuration to the final report tarball
Inspecting kubernetes cluster
Inspect kubernetes cluster
sudo: unable to resolve host ip-172-31-16-112: Name or service not known
sudo: unable to resolve host ip-172-31-16-112: Name or service not known
sudo: unable to resolve host ip-172-31-16-112: Name or service not known
sudo: unable to resolve host ip-172-31-16-112: Name or service not known
sudo: unable to resolve host ip-172-31-16-112: Name or service not known
sudo: unable to resolve host ip-172-31-16-112: Name or service not known
Inspecting dqlite
Inspect dqlite
cp: cannot stat '/var/snap/microk8s/6364/var/kubernetes/backend/localnode.yaml': No such file or directory
Building the report tarball
Report tarball is at /var/snap/microk8s/6364/inspection-report-20240110_102300.tar.gz
microk8s_1.29_6364-inspection-report-20240110_102300.tar.gz
- Refreshing the v1.28 (6089) instance to v1.29 (6364) works at first glance, but the inspect output does not look right:
admin@ip-172-31-18-155:~$ microk8s kubectl get all -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system pod/coredns-864597b5fd-k7hvt 1/1 Running 0 2m29s
kube-system pod/calico-kube-controllers-77bd7c5b-fp4zd 1/1 Running 0 2m29s
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default service/kubernetes ClusterIP 10.152.183.1 <none> 443/TCP 2m35s
kube-system service/kube-dns ClusterIP 10.152.183.10 <none> 53/UDP,53/TCP,9153/TCP 2m32s
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-system daemonset.apps/calico-node 1 1 1 1 1 kubernetes.io/os=linux 2m34s
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
kube-system deployment.apps/coredns 1/1 1 1 2m32s
kube-system deployment.apps/calico-kube-controllers 1/1 1 1 2m34s
NAMESPACE NAME DESIRED CURRENT READY AGE
kube-system replicaset.apps/coredns-864597b5fd 1 1 1 2m29s
kube-system replicaset.apps/calico-kube-controllers-77bd7c5b 1 1 1 2m29s
admin@ip-172-31-18-155:~$ microk8s inspect
Inspecting system
Inspecting Certificates
Inspecting services
Service snap.microk8s.daemon-cluster-agent is running
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-kubelite is running
Service snap.microk8s.daemon-k8s-dqlite is running
Service snap.microk8s.daemon-apiserver-kicker is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
Copy processes list to the final report tarball
Copy disk usage information to the final report tarball
Copy memory usage information to the final report tarball
Copy server uptime to the final report tarball
Copy openSSL information to the final report tarball
Copy snap list to the final report tarball
Copy VM name (or none) to the final report tarball
Copy current linux distribution to the final report tarball
Copy asnycio usage and limits to the final report tarball
Copy inotify max_user_instances and max_user_watches to the final report tarball
Copy network configuration to the final report tarball
Inspecting kubernetes cluster
Inspect kubernetes cluster
sudo: unable to resolve host ip-172-31-18-155: Name or service not known
sudo: unable to resolve host ip-172-31-18-155: Name or service not known
sudo: unable to resolve host ip-172-31-18-155: Name or service not known
sudo: unable to resolve host ip-172-31-18-155: Name or service not known
sudo: unable to resolve host ip-172-31-18-155: Name or service not known
sudo: unable to resolve host ip-172-31-18-155: Name or service not known
Inspecting dqlite
Inspect dqlite
Building the report tarball
Report tarball is at /var/snap/microk8s/6364/inspection-report-20240110_103926.tar.gz
microk8s-1.28_6089-refreshed-1.29_6364-inspection-report-20240110_103926.tar.gz
- The strangest thing: when I removed the MicroK8s package via
sudo snap remove --purge microk8s
and installed v1.29 (6364) again, the (single-node) cluster seems to work as expected, but the inspect output again does not look right:
admin@ip-172-31-18-155:~$ microk8s kubectl get all -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system pod/calico-node-bggsw 1/1 Running 0 106s
kube-system pod/coredns-864597b5fd-wzdz9 1/1 Running 0 105s
kube-system pod/calico-kube-controllers-77bd7c5b-vlk94 1/1 Running 0 105s
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default service/kubernetes ClusterIP 10.152.183.1 <none> 443/TCP 111s
kube-system service/kube-dns ClusterIP 10.152.183.10 <none> 53/UDP,53/TCP,9153/TCP 109s
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-system daemonset.apps/calico-node 1 1 1 1 1 kubernetes.io/os=linux 111s
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
kube-system deployment.apps/coredns 1/1 1 1 110s
kube-system deployment.apps/calico-kube-controllers 1/1 1 1 111s
NAMESPACE NAME DESIRED CURRENT READY AGE
kube-system replicaset.apps/coredns-864597b5fd 1 1 1 106s
kube-system replicaset.apps/calico-kube-controllers-77bd7c5b 1 1 1 106s
admin@ip-172-31-18-155:~$ microk8s inspect
Inspecting system
Inspecting Certificates
Inspecting services
Service snap.microk8s.daemon-cluster-agent is running
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-kubelite is running
Service snap.microk8s.daemon-k8s-dqlite is running
Service snap.microk8s.daemon-apiserver-kicker is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
Copy processes list to the final report tarball
Copy disk usage information to the final report tarball
Copy memory usage information to the final report tarball
Copy server uptime to the final report tarball
Copy openSSL information to the final report tarball
Copy snap list to the final report tarball
Copy VM name (or none) to the final report tarball
Copy current linux distribution to the final report tarball
Copy asnycio usage and limits to the final report tarball
Copy inotify max_user_instances and max_user_watches to the final report tarball
Copy network configuration to the final report tarball
Inspecting kubernetes cluster
Inspect kubernetes cluster
sudo: unable to resolve host ip-172-31-18-155: Name or service not known
sudo: unable to resolve host ip-172-31-18-155: Name or service not known
sudo: unable to resolve host ip-172-31-18-155: Name or service not known
sudo: unable to resolve host ip-172-31-18-155: Name or service not known
sudo: unable to resolve host ip-172-31-18-155: Name or service not known
sudo: unable to resolve host ip-172-31-18-155: Name or service not known
Inspecting dqlite
Inspect dqlite
cp: cannot stat '/var/snap/microk8s/6364/var/kubernetes/backend/localnode.yaml': No such file or directory
Building the report tarball
Report tarball is at /var/snap/microk8s/6364/inspection-report-20240110_104641.tar.gz
microk8s-reinstall-1.29_6364-inspection-report-20240110_104641.tar.gz.tar.gz
What Should Happen Instead?
I hope somebody on the development team can find the reason for this behavior. I guess there is something installed on the host system during the v1.28 installation that failed in v1.29 and is not removed during the snap remove --purge process.
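Side note: the repeated "sudo: unable to resolve host ip-172-31-16-112" lines in the inspect output above appear to be local hostname-resolution noise on the EC2 instance rather than part of the MicroK8s failure. A minimal sketch to silence them (hostname and private IP taken from the output above):
echo "172.31.16.112 ip-172-31-16-112" | sudo tee -a /etc/hosts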
Reproduction Steps
Explained above (incl. inspection tarballs).
If there are any open points, I will try to answer your questions. Thanks!
For me:
snap remove --purge microk8s
snap install microk8s --classic --channel=1.29/stable
root@microk8s-master:~# microk8s status
microk8s is not running. Use microk8s inspect for a deeper inspection.
The inspect log looks the same as your first inspect output.
I don't know what's going on.
@odoo-sh thanks for your fast feedback.
Do you really mean my first inspect output, which represents a v1.28 installation without errors? Or do you mean my last output after a re-installation (microk8s-reinstall-1.29_6364-inspection-report-20240110_104641.tar.gz.tar.gz)?
Sorry, just to clarify.
root@microk8s-master:~# microk8s start
root@microk8s-master:~# microk8s status
microk8s is not running. Use microk8s inspect for a deeper inspection.
root@microk8s-master:~# microk8s inspect
Inspecting system
Inspecting Certificates
Inspecting services
Service snap.microk8s.daemon-cluster-agent is running
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-kubelite is running
Service snap.microk8s.daemon-k8s-dqlite is running
Service snap.microk8s.daemon-apiserver-kicker is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
Copy processes list to the final report tarball
Copy disk usage information to the final report tarball
Copy memory usage information to the final report tarball
Copy server uptime to the final report tarball
Copy openSSL information to the final report tarball
Copy snap list to the final report tarball
Copy VM name (or none) to the final report tarball
Copy current linux distribution to the final report tarball
Copy asnycio usage and limits to the final report tarball
Copy inotify max_user_instances and max_user_watches to the final report tarball
Copy network configuration to the final report tarball
Inspecting kubernetes cluster
Inspect kubernetes cluster
Inspecting dqlite
Inspect dqlite
cp: cannot stat '/var/snap/microk8s/6357/var/kubernetes/backend/localnode.yaml': No such file or directory
Building the report tarball
Report tarball is at /var/snap/microk8s/6357/inspection-report-20240111_061543.tar.gz
I have the same problem on Ubuntu Server 22.04 with snap install microk8s --classic --channel=1.29/stable: the file localnode.yaml does not exist.
microk8s inspect
Inspecting dqlite
Inspect dqlite
cp: cannot stat '/var/snap/microk8s/6364/var/kubernetes/backend/localnode.yaml': No such file or directory
Same problem as well. Also, I'm wondering whether those of you who got 1.29 running (e.g. by upgrading from 1.28) also can't use kubectl port-forward? It's extremely slow.
Same issue on Ubuntu Desktop 23.10:
Inspecting dqlite
Inspect dqlite
cp: cannot stat '/var/snap/microk8s/6370/var/kubernetes/backend/localnode.yaml': No such file or directory
Hi @TecIntelli and other folks who are running into this, sorry for taking so long to check this.
This seems to be related to cgroups; I see the following in the error logs (and I can also reproduce it on Debian 12 systems):
Jan 10 10:16:39 ip-172-31-16-112 microk8s.daemon-kubelite[8441]: E0110 10:16:39.649969 8441 kubelet.go:1542] "Failed to start ContainerManager" err="failed to initialize top level QOS containers: root container [kubepods] doesn't exist"
Jan 10 10:16:39 ip-172-31-16-112 systemd[1]: snap.microk8s.daemon-kubelite.service: Main process exited, code=exited, status=1/FAILURE
Jan 10 10:16:39 ip-172-31-16-112 systemd[1]: snap.microk8s.daemon-kubelite.service: Failed with result 'exit-code'.
Jan 10 10:16:39 ip-172-31-16-112 systemd[1]: snap.microk8s.daemon-kubelite.service: Consumed 6.137s CPU time.
Jan 10 10:16:39 ip-172-31-16-112 systemd[1]: snap.microk8s.daemon-kubelite.service: Scheduled restart job, restart counter is at 1.
Jan 10 10:16:39 ip-172-31-16-112 systemd[1]: Stopped snap.microk8s.daemon-kubelite.service - Service for snap application microk8s.daemon-kubelite.
Jan 10 10:16:39 ip-172-31-16-112 systemd[1]: snap.microk8s.daemon-kubelite.service: Consumed 6.137s CPU time.
Jan 10 10:16:39 ip-172-31-16-112 systemd[1]: Started snap.microk8s.daemon-kubelite.service - Service for snap application microk8s.daemon-kubelite.
One workaround for this is to disable QOS cgroups and node-allocatable enforcement on the kubelet with:
echo '
--cgroups-per-qos=false
--enforce-node-allocatable=""
' | sudo tee -a /var/snap/microk8s/current/args/kubelet
sudo snap restart microk8s.daemon-kubelite
Afterwards, MicroK8s should come up. We will take this back to see what the root cause is and what sort of mitigations we could apply to prevent this in out-of-the-box deployments.
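A quick way to confirm the workaround took effect (a sketch; service and command names as used above):
# kubelite should now stay active instead of crash-looping
sudo snap services microk8s.daemon-kubelite
# and the node should eventually report ready
microk8s status --wait-ready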
To add some more details, this is what I'm seeing on a Debian 12 instance where I can reproduce the issue:
root@test-debian:/sys/fs/cgroup/kubepods# mount -t cgroup2
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot)
root@test-debian:/sys/fs/cgroup/kubepods# cat /sys/fs/cgroup/cgroup.controllers
cpuset cpu io memory hugetlb pids rdma misc
root@test-debian:/sys/fs/cgroup/kubepods# cat /sys/fs/cgroup/kubepods/cgroup.controllers
cpu io memory hugetlb pids rdma misc
root@test-debian:/sys/fs/cgroup/kubepods# echo '+cpuset' > cgroup.subtree_control
bash: echo: write error: No such file or directory
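For what it's worth (my reading of cgroup v2 semantics here, not a confirmed root cause): a controller only appears in a child cgroup's cgroup.controllers if the parent lists it in its own cgroup.subtree_control, and writing '+cpuset' into a cgroup whose cgroup.controllers lacks it fails with exactly this "No such file or directory" error. A read-only check of the parent, as a sketch:
# kubepods sits directly under the cgroup root in the output above,
# so the root's subtree_control decides which controllers kubepods can see
cat /sys/fs/cgroup/cgroup.subtree_control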
Let me share some news on this issue, regarding the problem we initially reported. Maybe somebody else can add more context to the findings we have made.
It might be an issue with kernel 6.1 as used on Debian 12 (last tried with the latest version, 6.1.69). When we upgraded the kernel to 6.5.10 manually, we could install MicroK8s 1.29 latest/edge (6469) without problems and all expected pods came up properly.
Let me attach the inspect files for comparison if required: Kernel 6.1.69: debian12.4_kernel6.1.69-1_inspection-report-20240130_125914.tar.gz Kernel 6.5.10: debian12.4_kernel6.5.10-1~bpo12+1_inspection-report-20240130_130629.tar.gz
Additionally (tying in with @neoaggelos' detailed information), we have also figured out that the reason on kernel 6.1.x might be a delegation issue. If we apply the following before installing MicroK8s, the initial problem does not occur.
# mkdir -p /etc/systemd/system/user@.service.d
# cat > /etc/systemd/system/user@.service.d/delegate.conf << EOF
[Service]
Delegate=cpu cpuset io memory pids
EOF
# systemctl daemon-reload
(Reference: the opencontainers cgroup v2 documentation on GitHub.) Let me also attach the inspect file with these settings: debian12.4_kernel6.1.69-1_inspection-report-20240130_135728.tar.gz
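A quick check that the delegation change took effect, as a sketch (assuming the fix works as described, cpuset should be listed here once MicroK8s is installed and kubelite has created the kubepods cgroup):
cat /sys/fs/cgroup/kubepods/cgroup.controllers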
Hi @TecIntelli, thanks a lot for looking deeper and coming up with a path towards a solution. It is still not clear to me how we could handle this on the MicroK8s side; I do not think it's a good approach to mess with the system like this.
I spontaneously ran into the same issue on an HA cluster running Ubuntu 22.04 LTS (Hetzner cloud server) and MicroK8s 1.29/stable. First, I spotted weird behavior on one faulty node of the HA cluster (a container stayed in the Terminating state, no deletion possible). After rebooting, I observed that the microk8s status output was flaky, alternating between proper status reports, "not running" messages, and "random" execution errors. No issues were reported when running microk8s inspect.
At some point I realized that journalctl -f -u snap.microk8s.daemon-kubelite was logging a lot, with some hidden errors in between. It took me a while to understand that microk8s.daemon-kubelite was actually not starting (which was sadly not reflected by microk8s inspect):
Failed to start ContainerManager" err="failed to initialize top level QOS containers: root container [kubepods] doesn't exist"
After setting up a clean new machine with Ubuntu 22.04.3 LTS and 1.29/stable (single node), I ran into the same non-starting microk8s.daemon-kubelite. On top of that I got the missing localnode.yaml error reported by @Zvirovyi earlier.
For now, I managed to restore the cluster by downgrading MicroK8s to v1.28.3:
snap refresh microk8s --classic --channel=1.28/stable
PS: Adding and removing nodes from the HA cluster was very smooth in every stage, even with the "broken" 1.29/stable. Kudos to the maintainers!
@dimw Do you remember what kernel version ran on your broken node with Ubuntu 22.04 and on the new clean host with Ubuntu 22.04.3? I tested a new AWS EC2 instance with Ubuntu 22.04.3 LTS without any problems; the snap package with MicroK8s 1.29/stable (6364) on a single node starts as expected. Kernel: 6.2.0-1018-aws
@TecIntelli I made a snapshot of the machine before purging it, so I restored it now and checked the data. Both machines have the same configuration:
- Ubuntu 22.04.3 LTS
- Kernel: 5.15.0-92-generic
@dimw I was just curious and did a short test on an AWS EC2 instance with Ubuntu 22.04.3 and kernel 5.15.0-1052-aws. Unfortunately I cannot confirm the behavior you mentioned when installing MicroK8s 1.29/stable (6364) via snap: it seems to run smoothly, and all pods came up as expected. The issue might be different.
Here is the inspect file of the single-node instance: ubuntu22.04.3_kernel5.15.0-1052-aws_inspection-report-20240131_132643.tar.gz
@TecIntelli I repeated the process yesterday, installed the newest Ubuntu on Hetzner Cloud, and ran into the following two issues again:
- microk8s.daemon-kubelite not starting
- error on MicroK8s inspect:
cp: cannot stat '/var/snap/microk8s/6364/var/kubernetes/backend/localnode.yaml': No such file or directory
$ apt update
$ apt upgrade -y
$ apt install snapd -y
$ snap install microk8s --classic --channel=1.29/stable
$ reboot # after kernel upgrade
$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.3 LTS
Release: 22.04
Codename: jammy
$ uname -r
5.15.0-92-generic
$ microk8s start
$ microk8s inspect
microk8s inspect
Inspecting system
Inspecting Certificates
Inspecting services
Service snap.microk8s.daemon-cluster-agent is running
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-kubelite is running
Service snap.microk8s.daemon-k8s-dqlite is running
Service snap.microk8s.daemon-apiserver-kicker is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
Copy processes list to the final report tarball
Copy disk usage information to the final report tarball
Copy memory usage information to the final report tarball
Copy server uptime to the final report tarball
Copy openSSL information to the final report tarball
Copy snap list to the final report tarball
Copy VM name (or none) to the final report tarball
Copy current linux distribution to the final report tarball
Copy asnycio usage and limits to the final report tarball
Copy inotify max_user_instances and max_user_watches to the final report tarball
Copy network configuration to the final report tarball
Inspecting kubernetes cluster
Inspect kubernetes cluster
Inspecting dqlite
Inspect dqlite
cp: cannot stat '/var/snap/microk8s/6364/var/kubernetes/backend/localnode.yaml': No such file or directory
Building the report tarball
Report tarball is at /var/snap/microk8s/6364/inspection-report-20240131_203342.tar.gz
$ journalctl -u snap.microk8s.daemon-kubelite -n 1000 | grep "err="
Jan 31 20:38:20 ubuntu-4gb-fsn1-2 microk8s.daemon-kubelite[65475]: E0131 20:38:20.663704 65475 kubelet.go:2353] "Skipping pod synchronization" err="[container runtime status check may not have completed yet, PLEG is not healthy: pleg has yet to be successful]"
Jan 31 20:38:20 ubuntu-4gb-fsn1-2 microk8s.daemon-kubelite[65475]: E0131 20:38:20.721005 65475 container_manager_linux.go:881] "Unable to get rootfs data from cAdvisor interface" err="unable to find data in memory cache"
Jan 31 20:38:20 ubuntu-4gb-fsn1-2 microk8s.daemon-kubelite[65475]: E0131 20:38:20.772964 65475 kubelet.go:2353] "Skipping pod synchronization" err="[container runtime status check may not have completed yet, PLEG is not healthy: pleg has yet to be successful]"
Jan 31 20:38:20 ubuntu-4gb-fsn1-2 microk8s.daemon-kubelite[65475]: E0131 20:38:20.967043 65475 kubelet.go:1542] "Failed to start ContainerManager" err="failed to initialize top level QOS containers: root container [kubepods] doesn't exist"
I also tried the same with Ubuntu 20.04.6 LTS (kernel 5.4.0-170-generic) and am getting the same microk8s inspect error, but microk8s.daemon-kubelite is starting and the cluster seems to be operational.
Hi all,
I have the same issue on Oracle Linux 9.3:
microk8s inspect
Inspecting system
Inspecting Certificates
Inspecting services
Service snap.microk8s.daemon-cluster-agent is running
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-kubelite is running
Service snap.microk8s.daemon-k8s-dqlite is running
Service snap.microk8s.daemon-apiserver-kicker is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
Copy processes list to the final report tarball
Copy disk usage information to the final report tarball
Copy memory usage information to the final report tarball
Copy server uptime to the final report tarball
Copy openSSL information to the final report tarball
Copy snap list to the final report tarball
Copy VM name (or none) to the final report tarball
Copy current linux distribution to the final report tarball
Copy asnycio usage and limits to the final report tarball
Copy inotify max_user_instances and max_user_watches to the final report tarball
Copy network configuration to the final report tarball
Inspecting kubernetes cluster
Inspect kubernetes cluster
Inspecting dqlite
Inspect dqlite
cp: cannot stat '/var/snap/microk8s/6641/var/kubernetes/backend/localnode.yaml': No such file or directory
Building the report tarball
Report tarball is at /var/snap/microk8s/6641/inspection-report-20240327_173242.tar.gz
Same on Ubuntu Server 22.04.4.
Same on Debian GNU/Linux 12 (bookworm)
microk8s.inspect:
Inspecting system
Inspecting Certificates
Inspecting services
Service snap.microk8s.daemon-cluster-agent is running
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-kubelite is running
Service snap.microk8s.daemon-k8s-dqlite is running
Service snap.microk8s.daemon-apiserver-kicker is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
Copy processes list to the final report tarball
Copy disk usage information to the final report tarball
Copy memory usage information to the final report tarball
Copy server uptime to the final report tarball
Copy openSSL information to the final report tarball
Copy snap list to the final report tarball
Copy VM name (or none) to the final report tarball
Copy current linux distribution to the final report tarball
Copy asnycio usage and limits to the final report tarball
Copy inotify max_user_instances and max_user_watches to the final report tarball
Copy network configuration to the final report tarball
Inspecting kubernetes cluster
Inspect kubernetes cluster
Inspecting dqlite
Inspect dqlite
cp: cannot stat '/var/snap/microk8s/6668/var/kubernetes/backend/localnode.yaml': No such file or directory
Building the report tarball
Report tarball is at /var/snap/microk8s/6668/inspection-report-20240406_123559.tar.gz
System info:
PRETTY_NAME="Ubuntu 22.04 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04 (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
When I installed version 1.29 using snap install microk8s --classic --channel=1.29/stable and ran inspect:
microk8s inspect
Inspecting system
Inspecting Certificates
Inspecting services
Service snap.microk8s.daemon-cluster-agent is running
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-kubelite is running
Service snap.microk8s.daemon-k8s-dqlite is running
Service snap.microk8s.daemon-apiserver-kicker is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
Copy processes list to the final report tarball
Copy disk usage information to the final report tarball
Copy memory usage information to the final report tarball
Copy server uptime to the final report tarball
Copy openSSL information to the final report tarball
Copy snap list to the final report tarball
Copy VM name (or none) to the final report tarball
Copy current linux distribution to the final report tarball
Copy asnycio usage and limits to the final report tarball
Copy inotify max_user_instances and max_user_watches to the final report tarball
Copy network configuration to the final report tarball
Inspecting kubernetes cluster
Inspect kubernetes cluster
Inspecting dqlite
Inspect dqlite
cp: cannot stat '/var/snap/microk8s/6641/var/kubernetes/backend/localnode.yaml': No such file or directory
WARNING: Maximum number of inotify user watches is less than the recommended value of 1048576.
Increase the limit with:
echo fs.inotify.max_user_watches=1048576 | sudo tee -a /etc/sysctl.conf
sudo sysctl --system
I got the above error.
Solution:
snap remove --purge microk8s
snap install microk8s --classic --channel=1.28/stable
I ran into this same issue with the
cp: cannot stat '/var/snap/microk8s/6370/var/kubernetes/backend/localnode.yaml': No such file or directory
and
"Failed to start ContainerManager" err="failed to initialize top level QOS containers: root container [kubepods] doesn't exist"
errors; the latter was my Google search. This happened on a fresh Ubuntu Server 22.04.4 minimal installation, having only done an apt upgrade. It looks like the default version of microk8s installed was 1.29/stable (selected during the server install), which resulted in the above errors.
The fix for me was to roll back to 1.28, i.e.
sudo snap remove --purge microk8s
sudo snap install microk8s --classic --channel=1.28/stable
This let me start back up the node. Subsequently I upgraded to latest:
sudo snap refresh microk8s --channel 1.30/stable
and rejoined the node to the cluster.
microk8s join 192.168.x.x:25000/xxxxxxxxxxxxxxxxxxxxxxxxxx/xxxxxxxxx
Everything seems to be in order.
Hi @Nospamas and all, this seems to have started on Debian, but it is currently affecting Ubuntu versions as well. This is related to the kubepods cgroup not getting the cpuset controller up on 1.29 and 1.30.
We have a fix, #4503, that is out on the 1.29/edge and 1.30/edge channels, and it will shortly find its way into 1.29/stable and 1.30/stable respectively. So, if people are currently experiencing issues, I would recommend:
# switch to 1.30/edge channel if running 1.30
sudo snap refresh microk8s --channel 1.30/edge
# switch to 1.29/edge channel if running 1.29
sudo snap refresh microk8s --channel 1.29/edge
The issue will remain open until the bugfix is promoted to stable.
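Once the fix is promoted, switching back should just be another channel refresh (a sketch, using the 1.29 track as an example):
sudo snap refresh microk8s --channel 1.29/stable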
Same issue on Raspberry Pi 5:
Linux pi1 6.6.28+rpt-rpi-2712 #1 SMP PREEMPT Debian 1:6.6.28-1+rpt1 (2024-04-22) aarch64 GNU/Linux
$ microk8s.inspect
Inspecting system
Inspecting Certificates
Inspecting services
Service snap.microk8s.daemon-cluster-agent is running
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-kubelite is running
Service snap.microk8s.daemon-k8s-dqlite is running
Service snap.microk8s.daemon-apiserver-kicker is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
Copy processes list to the final report tarball
Copy disk usage information to the final report tarball
Copy memory usage information to the final report tarball
Copy server uptime to the final report tarball
Copy openSSL information to the final report tarball
Copy snap list to the final report tarball
Copy VM name (or none) to the final report tarball
Copy current linux distribution to the final report tarball
Copy asnycio usage and limits to the final report tarball
Copy inotify max_user_instances and max_user_watches to the final report tarball
Copy network configuration to the final report tarball
Inspecting kubernetes cluster
Inspect kubernetes cluster
Inspecting dqlite
Inspect dqlite
cp: cannot stat '/var/snap/microk8s/6799/var/kubernetes/backend/localnode.yaml': No such file or directory
Building the report tarball
Report tarball is at /var/snap/microk8s/6799/inspection-report-20240506_095902.tar.gz
inspection-report-20240506_095631.tar.gz inspection-report-20240506_095902.tar.gz
(Quoting the fix announcement above: fix #4503 is out on the 1.29/edge and 1.30/edge channels and will shortly reach 1.29/stable and 1.30/stable; affected users can refresh to the matching edge channel in the meantime.)
Are we able to switch back to stable once the bug is fixed, or is it best to use version 1.28?
I hate microk8s
Hi,
I have the same issue on my Ubuntu 22.04. I installed version 1.29/edge, but microk8s is not running, with this error:
Inspecting system
Inspecting Certificates
Inspecting services
Service snap.microk8s.daemon-cluster-agent is running
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-kubelite is running
Service snap.microk8s.daemon-k8s-dqlite is running
Service snap.microk8s.daemon-apiserver-kicker is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
Copy processes list to the final report tarball
Copy disk usage information to the final report tarball
Copy memory usage information to the final report tarball
Copy server uptime to the final report tarball
Copy openSSL information to the final report tarball
Copy snap list to the final report tarball
Copy VM name (or none) to the final report tarball
Copy current linux distribution to the final report tarball
Copy asnycio usage and limits to the final report tarball
Copy inotify max_user_instances and max_user_watches to the final report tarball
Copy network configuration to the final report tarball
Inspecting kubernetes cluster
Inspect kubernetes cluster
Inspecting dqlite
Inspect dqlite
cp: cannot stat '/var/snap/microk8s/6887/var/kubernetes/backend/localnode.yaml': No such file or directory
Building the report tarball
Report tarball is at /var/snap/microk8s/6887/inspection-report-20240606_145417.tar.gz
report file inspection-report-20240606_145417.tar.gz
k3s is life!
Hi, same issue here on Ubuntu 22.04 with both 1.29/edge and 1.30/edge. Refreshing to 1.28/stable seems to work.
The fix for me was to roll back to 1.28, i.e.
sudo snap remove --purge microk8s
sudo snap install microk8s --classic --channel=1.28/stable
This let me start back up the node. Subsequently I upgraded to latest:
sudo snap refresh microk8s --channel 1.30/stable
and I no longer get:
cp: cannot stat '/var/snap/microk8s/6887/var/kubernetes/backend/localnode.yaml': No such file or directory
But I still have the "microk8s is not running" problem:
$ microk8s status
microk8s is not running. Use microk8s inspect for a deeper inspection.
$ microk8s inspect
Inspecting system
Inspecting Certificates
Inspecting services
Service snap.microk8s.daemon-cluster-agent is running
Service snap.microk8s.daemon-containerd is running
Service snap.microk8s.daemon-kubelite is running
Service snap.microk8s.daemon-k8s-dqlite is running
Service snap.microk8s.daemon-apiserver-kicker is running
Copy service arguments to the final report tarball
Inspecting AppArmor configuration
Gathering system information
Copy processes list to the final report tarball
Copy disk usage information to the final report tarball
Copy memory usage information to the final report tarball
Copy server uptime to the final report tarball
Copy openSSL information to the final report tarball
Copy snap list to the final report tarball
Copy VM name (or none) to the final report tarball
Copy current linux distribution to the final report tarball
Copy asnycio usage and limits to the final report tarball
Copy inotify max_user_instances and max_user_watches to the final report tarball
Copy network configuration to the final report tarball
Inspecting kubernetes cluster
Inspect kubernetes cluster
Inspecting dqlite
Inspect dqlite
Building the report tarball
Report tarball is at /var/snap/microk8s/6946/inspection-report-20240619_233116.tar.gz
It fixed my issues: I successfully created a configuration file named localnode.yaml under the missing file's directory. The contents of the file specify a Kubernetes ConfigMap with the following details:
apiVersion: v1
kind: ConfigMap
metadata:
  name: localnode-config
  namespace: kube-system
data:
  address: 192.168.1.100:19001
  role: node
This configuration sets the address to 192.168.1.100:19001 and assigns the role as a node. Purpose of localnode.yaml: the localnode.yaml file defines configuration for a single-node Kubernetes cluster. [It specifies details like the node's IP address, port, and role (e.g., as a regular node) within the cluster.] MicroK8s uses this configuration to set up the local Kubernetes environment.
(Quoting the fix announcement above about refreshing to the 1.29/edge or 1.30/edge channel, and the follow-up question about switching back to stable once the bug is fixed.)
Not working; WSL with Ubuntu 20.04, 22.04, and 24.04 all gave cp: cannot stat '/var/snap/microk8s/****/var/kubernetes/backend/localnode.yaml': No such file or directory