Failed to start ContainerManager" err="failed to get rootfs info: failed to get mount point for device..."
I'm getting the error as described in known issues (https://kind.sigs.k8s.io/docs/user/known-issues/), but creating and using the cluster config file did not change anything:
Jan 05 23:26:32 kind-control-plane kubelet[1763]: E0105 23:26:32.106420 1763 kubelet.go:1649] "Failed to start ContainerManager" err="failed to get rootfs info: failed to get mount point for device \"/dev/nvme0n1p2\": no partition info for device \"/dev/nvme0n1p2\""
Jan 05 23:26:32 kind-control-plane systemd[1]: kubelet.service: Main process exited, code=exited, status=1/FAILURE
My cluster YAML looks like this; the partition uses the F2FS filesystem:
(starting it with kind create cluster --config ~/.kind/cluster.yaml)
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
  - hostPath: /dev/nvme0n1p2
    containerPath: /dev/nvme0n1p2
    propagation: HostToContainer
kind version:
kind v0.26.0 go1.23.4 linux/amd64
docker version:
Client:
Version: 26.1.0
API version: 1.45
Go version: go1.23.1
Git commit: 9714adc6c797755f63053726c56bc1c17c0c9204
Built: Sun Dec 8 21:43:42 2024
OS/Arch: linux/amd64
Context: default
Server:
Engine:
Version: 26.1.0
API version: 1.45 (minimum version 1.24)
Go version: go1.23.3
Git commit: 061aa95809be396a6b5542618d8a34b02a21ff77
Built: Thu Dec 12 15:02:12 2024
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: v1.7.15
GitCommit: 926c9586fe4a6236699318391cd44976a98e31f1
runc:
Version: 1.1.12
GitCommit: 51d5e94601ceffbbd85688df1c928ecccbfa4685
docker-init:
Version: 0.19.0
GitCommit: de40ad007797e0dcd8b7126f27bb87401d224240
Is there something else I should check, or is there another workaround?
I think we need to know a little more about your environment. Can you include the output from docker info?
You can also run kind create cluster --config ~/.kind/cluster.yaml --retain to keep the node container around after failure to inspect it for config issues or look for log messages by exec'ing in and running commands. You can also do kind export logs to collect up the various logs of interest from the node.
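Concretely, that debugging loop might look something like this (kind-control-plane is the default node container name for a single-node cluster named "kind"; the ./kind-logs directory is just an example):

kind create cluster --config ~/.kind/cluster.yaml --retain
# inspect the node container that was left behind
docker exec -it kind-control-plane journalctl -u kubelet --no-pager | tail -n 50
docker exec -it kind-control-plane crictl ps -a
# collect kubelet.log, containerd.log, etc. into a local directory
kind export logs ./kind-logs
# clean up afterwards
kind delete cluster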
the partition has file system F2FS
Not familiar with this one, but using a more common partition type, e.g. ext4, will probably fix it.
docker info:
Client:
Version: 26.1.0
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: 0.14.0
Path: /usr/libexec/docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.28.1
Path: /usr/libexec/docker/cli-plugins/docker-compose
Server:
Containers: 6
Running: 3
Paused: 0
Stopped: 3
Images: 31
Server Version: 26.1.0
Storage Driver: overlay2
Backing Filesystem: f2fs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
Swarm: inactive
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 926c9586fe4a6236699318391cd44976a98e31f1
runc version: 51d5e94601ceffbbd85688df1c928ecccbfa4685
init version: de40ad007797e0dcd8b7126f27bb87401d224240
Security Options:
seccomp
Profile: builtin
cgroupns
Kernel Version: 6.6.62-gentoo-dist
Operating System: Gentoo Linux
OSType: linux
Architecture: x86_64
CPUs: 24
Total Memory: 188.5GiB
Name: shodan
ID: DLAE:EMXQ:UF4S:N7LR:JXGR:V5YJ:RLBU:FDIJ:C6FZ:C3X5:F7NM:HU5M
Docker Root Dir: /var/lib/docker
Debug Mode: false
Username: rnnr
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
the partition has file system F2FS
Not familiar with this one, but using a more common partition type, e.g. ext4, will probably fix it.
My rootfs is on this partition, and kind wants to know about the rootfs, no? How should I change the disk it uses? The cluster config file seems to be ignored.
You can also run
kind create cluster --config ~/.kind/cluster.yaml --retain to keep the node container around after failure to inspect it for config issues or look for log messages by exec'ing in and running commands. You can also do kind export logs to collect up the various logs of interest from the node.
I did this (or maybe without the --retain, but it does not matter), as the message I posted initially appears repeatedly in kubelet.log, and kind create cluster gets stuck on it for a while.
Attaching the whole file - I will provide any of the other logs as well, I just do not want to flood the thread with useless data, so please guide me. kubelet.log
My rootfs is on this and kind wants to know about rootfs, or no?
kubelet is looking for stats, but from its POV the "rootfs" will be whatever the storage for the "node" container is on.
The logs from kubelet don't make sense in this context because kubelet expects to be running directly on a "real" host (a machine or VM), not in a container (which is not technically supported upstream).
So the rootfs in this case would be whatever filesystem docker's data root is on with your volumes and containers.
This code is not in kind, and the filesystem stats need to work inside the container.
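A quick way to check which filesystem that actually is on the host (this just combines docker info's DockerRootDir field with df; in this case it should report f2fs):

# where docker keeps its data, and what filesystem backs it
df -T "$(docker info -f '{{ .DockerRootDir }}')"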
How should I change the disk it uses - the cluster config file seems to be ignored.
https://docs.docker.com/engine/daemon/#daemon-data-directory
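For reference, that amounts to setting data-root in the daemon config (usually /etc/docker/daemon.json on Linux; the target path below is only a placeholder) and restarting the daemon, which stops running containers unless live-restore is enabled:

/etc/docker/daemon.json:
{
  "data-root": "/path/on/an/ext4/filesystem"
}

sudo systemctl restart docker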
In theory we'd like kind to work with all of these, but in practice the container ecosystem is most well tested with ext4, possibly a few others, but definitely not all filesystems (and most of the relevant code is not in kind).
In the future the stats may come from kubelet and CRI (containerd here) instead of cadvisor.
See also: https://github.com/kubernetes-sigs/kind/pull/1464/files (not sure if this sort of thing is relevant for f2fs)
Thanks for the pointers, I'll hopefully look at it more closely soon. I appreciate the info, it's just that some more pressing things came up.
See also: https://github.com/kubernetes-sigs/kind/pull/1464/files (not sure if this sort of thing is relevant for f2fs)
I've checked the code. Not sure how the function mountDevMapper is supposed to be used, but the command docker info -f "{{.Driver}}" that I see it calls returns "overlay2" on my machine, so the function would return false.
Yes, we have no attempt to support F2FS specifically (and I'm not sure what is necessary for it), but you could try manually configuring the equivalent /dev/mapper mount on the off chance we have the same problem here.
https://kind.sigs.k8s.io/docs/user/configuration/#extra-mounts
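If you want to try that experiment, the mount would look something like this in the cluster config (a sketch only; kind adds this mount automatically just for certain storage drivers, which is why the function you found returns false for overlay2, and it may well be irrelevant for f2fs):

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
  - hostPath: /dev/mapper
    containerPath: /dev/mapper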
TBH, it's unclear why kind even cares about the backing fs. But here is a small workaround for those who face this issue:
- Create an ext4 file system in a file:
  sudo truncate --size=10G /home/docker.img && sudo mkfs.ext4 /home/docker.img
- Mount it via fstab:
  /home/docker.img /home/docker ext4 rw,noatime,nodiratime 0 0
- Update docker settings to use /home/docker as its data-root:
  {
    "data-root": "/home/docker"
  }
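Putting those steps together, the full sequence is roughly as follows (same example paths and size as above; assumes a systemd host, and note that restarting dockerd stops running containers unless live-restore is enabled):

sudo truncate --size=10G /home/docker.img
sudo mkfs.ext4 /home/docker.img
sudo mkdir -p /home/docker
echo '/home/docker.img /home/docker ext4 rw,noatime,nodiratime 0 0' | sudo tee -a /etc/fstab
sudo mount /home/docker   # add the loop option in fstab if mount does not set up a loop device automatically
echo '{ "data-root": "/home/docker" }' | sudo tee /etc/docker/daemon.json   # merge by hand if daemon.json already exists
sudo systemctl restart docker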
I ran into this issue when using kind in a Kata guest, hostPath-mounting /var/lib/kubelet from the host into the guest.
The cadvisor code here will fail, as even when the host block devices are mounted into the guest, they'll have a different major:minor number.
An easy way to get kubelet to start is to turn off localStorageCapacityIsolation in the kubelet config.
- See that the code is only called when localStorageCapacityIsolation is used: https://github.com/kubernetes/kubernetes/blob/854e67bb51e177b4b9c012928d8271704e9cb80d/pkg/kubelet/cm/container_manager_linux.go#L645
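For kind specifically, one way to do that (a sketch only; it assumes your kind release applies kubeadmConfigPatches to the generated KubeletConfiguration document, and note the conformance caveat in the next reply) is:

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
kubeadmConfigPatches:
- |
  kind: KubeletConfiguration
  apiVersion: kubelet.config.k8s.io/v1beta1
  localStorageCapacityIsolation: false
nodes:
- role: control-plane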
TBH, it's unclear why kind even cares about the backing fs. But here is a small workaround for those who face this issue:
It doesn't directly, but the container runtime (containerd inside the nodes) and kubelet absolutely do; they need to track filesystem stats and run overlayfs.
kind is aware of this when possible, in order to employ workarounds (such as using fuse-overlayfs), to enable containerd and kubelet.
In general, containers are sensitive to the backing filesystem. I recommend using common default filesystems from the ecosystem (e.g. ext4), because the kind project cannot be responsible for containerd, runc, podman, and so on having good support and performance for arbitrary filesystems.
An easy way to get kubelet to start is to turn off localStorageCapacityIsolation in the kubelet config.
This is, however, a GA functionality in Kubernetes. IIRC it is part of conformance. YMMV.
Yep, thanks for pointing that out. I use it for temp CI-type clusters so not an issue for me personally.