k3d
[BUG] rootless docker -> k3d blocks forever (k3s boot loops)
What did you do
Baseline:
- Fedora 33
- cgroups v2
- provision Docker CE from the official Docker repo for rootless docker
- install rootless docker following https://docs.docker.com/engine/security/rootless/ and make sure to convince Fedora to use fuse-overlayfs via `echo '{"storage-driver": "fuse-overlayfs"}' > ~/.config/docker/daemon.json`
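The one-liner above assumes that ~/.config/docker already exists, and it overwrites any existing daemon.json. A minimal sketch that creates the directory first; the function name `write_storage_driver_config` is hypothetical, and like the original command it replaces the whole file rather than merging settings:

```bash
# Hypothetical helper: write a daemon.json that selects fuse-overlayfs.
# Assumption: the rootless daemon reads its config from ~/.config/docker.
write_storage_driver_config() {
    config_dir="${1:-$HOME/.config/docker}"
    mkdir -p "$config_dir" || return 1
    # Overwrites an existing daemon.json, just like the echo one-liner.
    printf '%s\n' '{"storage-driver": "fuse-overlayfs"}' > "$config_dir/daemon.json"
}
```

After calling `write_storage_driver_config` and restarting the rootless daemon (the rootless guide uses `systemctl --user restart docker`), `docker info --format '{{.Driver}}'` should show which storage driver is actually active.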
k3d:
- `export USE_SUDO=false`
- `export K3D_INSTALL_DIR=~/bin` (~/bin exists and is on the PATH)
- `wget -q -O - https://raw.githubusercontent.com/rancher/k3d/main/install.sh | bash` (that's copy&paste)
- How was the cluster created?
  - `k3d cluster create mycluster` (that's copy&paste)
Problem: The command hangs after having emitted:
INFO[0000] Prep: Network
INFO[0000] Created network 'k3d-mycluster' (4f944e1b21bff3718107f3843216e9a69288b3579dce77377732a1417e82370f)
INFO[0000] Created volume 'k3d-mycluster-images'
INFO[0001] Creating node 'k3d-mycluster-server-0'
INFO[0001] Creating LoadBalancer 'k3d-mycluster-serverlb'
INFO[0001] Starting cluster 'mycluster'
INFO[0001] Starting servers...
INFO[0001] Starting Node 'k3d-mycluster-server-0'
After considerable time, it starts spewing
WARN[0204] Node 'k3d-mycluster-server-0' is restarting for more than a minute now. Possibly it will recover soon (e.g. when it's waiting to join). Consider using a creation timeout to avoid waiting forever in a Restart Loop.
which is somewhat understandable given that `docker logs k3d-mycluster-server-0` is unhappy with
I0501 17:24:44.193897 7 plugins.go:158] Loaded 12 mutating admission controller(s) successfully in the following order: NamespaceLifecycle,LimitRanger,ServiceAccount,NodeRestriction,TaintNodesByCondition,Priority,DefaultTolerationSeconds,DefaultStorageClass,StorageObjectInUseProtection,RuntimeClass,DefaultIngressClass,MutatingAdmissionWebhook.
I0501 17:24:44.193931 7 plugins.go:161] Loaded 10 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,RuntimeClass,CertificateApproval,CertificateSigning,CertificateSubjectRestriction,ValidatingAdmissionWebhook,ResourceQuota.
time="2021-05-01T17:24:44.209114066Z" level=info msg="Running kube-scheduler --address=127.0.0.1 --bind-address=127.0.0.1 --kubeconfig=/var/lib/rancher/k3s/server/cred/scheduler.kubeconfig --leader-elect=false --port=10251 --profiling=false --secure-port=0"
time="2021-05-01T17:24:44.209273499Z" level=info msg="Waiting for API server to become available"
time="2021-05-01T17:24:44.209489318Z" level=info msg="Running kube-controller-manager --address=127.0.0.1 --allocate-node-cidrs=true --bind-address=127.0.0.1 --cluster-cidr=10.42.0.0/16 --cluster-signing-cert-file=/var/lib/rancher/k3s/server/tls/client-ca.crt --cluster-signing-key-file=/var/lib/rancher/k3s/server/tls/client-ca.key --configure-cloud-routes=false --controllers=*,-service,-route,-cloud-node-lifecycle --kubeconfig=/var/lib/rancher/k3s/server/cred/controller.kubeconfig --leader-elect=false --port=10252 --profiling=false --root-ca-file=/var/lib/rancher/k3s/server/tls/server-ca.crt --secure-port=0 --service-account-private-key-file=/var/lib/rancher/k3s/server/tls/service.key --use-service-account-credentials=true"
time="2021-05-01T17:24:44.211128448Z" level=info msg="Node token is available at /var/lib/rancher/k3s/server/token"
time="2021-05-01T17:24:44.211182001Z" level=info msg="To join node to cluster: k3s agent -s https://172.22.0.2:6443 -t ${NODE_TOKEN}"
time="2021-05-01T17:24:44.214298925Z" level=info msg="Wrote kubeconfig /output/kubeconfig.yaml"
time="2021-05-01T17:24:44.215290745Z" level=info msg="Run: k3s kubectl"
time="2021-05-01T17:24:44.215494947Z" level=fatal msg="failed to find cpu cgroup (v2)"
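The fatal message appears to mean that k3s could not find the cpu controller in the cgroup v2 hierarchy it sees. One way to inspect what is available is to read a `cgroup.controllers` file. A minimal sketch, assuming the unified hierarchy is mounted at /sys/fs/cgroup; `has_cgroup2_controller` is a hypothetical helper, not part of k3s or k3d:

```bash
# Hypothetical helper: succeed if controller $1 is listed in the
# cgroup.controllers file $2 (default: the root of the unified hierarchy).
# Returns 0 if listed, 1 if not listed, 2 if the file cannot be read
# (which usually means cgroup v2 is not mounted at that path).
has_cgroup2_controller() {
    controller="$1"
    controllers_file="${2:-/sys/fs/cgroup/cgroup.controllers}"
    [ -r "$controllers_file" ] || return 2
    # The file holds one space-separated line, e.g. "cpuset cpu io memory pids".
    # Comparing whole words keeps "cpu" from matching "cpuset".
    for c in $(cat "$controllers_file"); do
        if [ "$c" = "$controller" ]; then
            return 0
        fi
    done
    return 1
}
```

For example, `has_cgroup2_controller cpu || echo "cpu controller not available here"` reports whether the cpu controller is visible from wherever the check runs.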
Note: I have not tried running k3s without the k3d wrapper (yet) - i.e. neither under root nor rootless.
From https://github.com/k3s-io/k3s/issues?q=is%3Aissue+is%3Aopen++rootless I cannot tell whether this is a k3s challenge or whether k3d driving k3s needs to be adapted?
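The warning in the log above suggests a creation timeout so that a restart loop does not block forever. A minimal sketch using the `--timeout` flag of `k3d cluster create`; the wrapper name `create_cluster_with_timeout` and the 120s default are assumptions made for illustration:

```bash
# Hypothetical wrapper: create a cluster, but give up (and let k3d roll back)
# if it is not up within the given duration.
create_cluster_with_timeout() {
    name="$1"
    timeout="${2:-120s}"   # assumed default, adjust to taste
    k3d cluster create "$name" --timeout "$timeout"
}
```

Calling `create_cluster_with_timeout mycluster 90s` does not fix the underlying cgroup problem; it only turns the endless wait into an error after 90 seconds.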
Hi @shoffmeister , thanks for opening this issue! Interesting things you're doing here :wink: So there are several points to note here:
- you're on cgroupv2, which currently only works with a k3d "hotfix" (see #579) and still needs to be fixed in upstream k3s (see https://github.com/k3s-io/k3s/pull/3242).
- k3d always starts containers with `--privileged`
- you have to tell k3s (inside k3d) to run rootless: `--k3s-server-arg "--rootless" --k3s-agent-arg "--rootless"`
I am rather innocently naïve (AKA ruthless) when it comes to doing interesting things 😛 It's software after all, and it's running inside a VM, to top that off even more ;)
Many thanks for the input! I will revisit this issue here once the stars have aligned on the next versions of k3s, k3d.
I have taken good note of passing an explicit `--rootless` into k3s.
https://rancher.com/docs/k3s/latest/en/advanced/#running-k3s-with-rootless-mode-experimental now documents steps for running k3s rootless (possibly as the result of https://github.com/k3s-io/k3s/pull/4086)
Alas, I am unable to translate the stern note
Don’t try to run k3s server --rootless on a terminal, as it doesn’t enable cgroup v2 delegation. If you really need to try it on a terminal, prepend systemd-run --user -p Delegate=yes --tty to create a systemd scope.
i.e., systemd-run --user -p Delegate=yes --tty k3s server --rootless
into something that would fit into the execution environment constructed by k3d (there is no systemd inside docker).
So, in trying to make progress on this issue here, I wonder whether it is possible at all to run k3s --rootless "inside" k3d on a rootless docker?
FWIW, I have yet to look into running k3s rootless proper.
- you have to tell k3s (inside k3d) to run rootless: `--k3s-server-arg "--rootless" --k3s-agent-arg "--rootless"`
I don't see `--k3s-server-arg` and `--k3s-agent-arg` options for `k3d cluster create`. Is running in rootless Docker now supported some other way? Given that there are instructions for rootless Podman, I assumed rootless Docker would work similarly.
I'm having problems with this too.
After enabling cpu / cpuset delegation (https://rootlesscontaine.rs/getting-started/common/cgroup2/#enabling-cpu-cpuset-and-io-delegation) I launched the cluster creation with:
k3d cluster create --k3s-arg "--rootless@server:0"
I got the following message in the log:
time="2023-03-21T08:43:13Z" level=fatal msg="expected sysctl value \"net.ipv4.ip_forward\" to be \"1\", got \"0\"; try adding \"net.ipv4.ip_forward=1\" to /etc/sysctl.conf and running sudo sysctl --system"
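The message asks for `net.ipv4.ip_forward` to be 1. A minimal sketch for checking the current value by reading /proc/sys/net/ipv4/ip_forward, the procfs counterpart of that sysctl key; `ip_forward_enabled` is a hypothetical helper name:

```bash
# Hypothetical helper: succeed if IPv4 forwarding is enabled according to
# the given procfs file (default: the live kernel setting).
# Returns 0 if the value is 1, 1 if it is anything else, 2 if unreadable.
ip_forward_enabled() {
    sysctl_file="${1:-/proc/sys/net/ipv4/ip_forward}"
    [ -r "$sysctl_file" ] || return 2
    [ "$(cat "$sysctl_file")" = "1" ]
}
```

Running `ip_forward_enabled || echo "net.ipv4.ip_forward is not 1"` on the host shows the host's setting. Whether the value k3s sees inside a rootless Docker container follows the host setting is not settled in this thread, so the check may need to be repeated inside the node container.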