systemd-shutdown hangs on containerd-shim when k3s-agent running
Environmental Info: K3s Version: k3s version v1.18.6+k3s1 (6f56fa1d)
Node(s) CPU architecture, OS, and Version: x86_64 Ubuntu 20.04.1 Linux nuc-linux3 5.4.0-48-generic #52-Ubuntu SMP Thu Sep 10 10:58:49 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Cluster Configuration: 1 master 2 workers
Describe the bug: When shutting down or rebooting the node, the shutdown hangs for approximately 90 seconds. The console message is
systemd-shutdown: waiting for process: containerd-shim
When researching the problem I landed on this issue: https://github.com/drud/ddev/issues/2538#issuecomment-705079079 where they reported that uninstalling k3s made the problem go away. I disabled and stopped k3s-agent.service, rebooted, and the problem went away for me as well.
I also tried re-enabling and starting k3s-agent.service and removing the docker.io package and running apt autoremove to remove containerd, runc, etc. but it still hangs on reboot at the same place.
Following https://github.com/containerd/containerd/issues/386#issuecomment-304837687 I changed the service configuration for k3s.service and k3s-agent.service to KillMode=mixed and that fixed the problem. This is what the standard Docker unit uses.
However, I also found https://github.com/rancher/k3s/issues/1965 where it looks like this behavior is as intended. Is there a way to allow for upgrading k3s without disrupting workloads but at the same time not hang shutdowns/reboots for 90s?
I was thinking one way to do it is to use KillMode=mixed or KillMode=control-group by default for k3s{-agent}.service and when doing an upgrade, add a drop-in in /run/systemd/system/k3s{-agent}.service.d that temporarily sets KillMode=process before stopping the service, then removes the drop-in after the upgrade.
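As a rough sketch of that idea (the drop-in path and filename below are hypothetical; this is not an existing k3s mechanism), an upgrade script could write a transient override before stopping the service:

```ini
# /run/systemd/system/k3s.service.d/upgrade-killmode.conf  (hypothetical drop-in)
# Temporarily leave child processes (the containers) running while the
# k3s unit itself is stopped for a binary swap.
[Service]
KillMode=process
```

After writing or removing the drop-in, a systemctl daemon-reload is needed for it to take effect; and since /run is a tmpfs, the override also vanishes on reboot, restoring the stricter KillMode for shutdowns.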
systemd has an explicit pre-shutdown hook, so perhaps you could invoke special logic with that. See:
/usr/lib/systemd/system/shutdown.target.wants
This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.
Bump still relevant
bump same issue
notice the issue with raspberry pi with display on
I faced this issue yesterday and ended up with the following solution.
/etc/systemd/system/kill-cgroup-procs@.service (any template unit name will do):
[Unit]
Description=Kill cgroup procs on shutdown for %i
DefaultDependencies=false
Before=shutdown.target umount.target
[Service]
# Instanced units are not part of system.slice for some reason
# without this, the service isn't started at shutdown
Slice=system.slice
ExecStart=/bin/bash -c 'pids=$(cat /sys/fs/cgroup/unified/system.slice/%i/cgroup.procs); echo $pids | xargs -r kill;'
ExecStart=/bin/sleep 5
ExecStart=/bin/bash -c 'pids=$(cat /sys/fs/cgroup/unified/system.slice/%i/cgroup.procs); echo $pids | xargs -r kill -9;'
Type=oneshot
[Install]
WantedBy=shutdown.target
Enable the "service" for k3s-agent.service (will also work for k3s on the master):
sudo systemctl enable kill-cgroup-procs@k3s-agent.service
# or, on the master: sudo systemctl enable kill-cgroup-procs@k3s.service
I've written a long, winding explanation here, but in brief: since KillMode=process is used, all the container processes stay alive when k3s is brought down. Which is a good thing :tm:
However, during shutdown, systemd will signal all remaining processes and wait for DefaultTimeoutStopSec for them to die.
This is always 90s during the last shutdown phase with systemd v245.
It is a bug in the systemd v245 shipped with Ubuntu 20.04, and it was fixed upstream in September 2020.
What I used to do was set DefaultTimeoutStopSec=5s in /etc/systemd/system.conf, and that worked fine, but on Ubuntu 20.04 it doesn't.
Since there's little chance this fix will make it back into 20.04, the above "service" performs a round of SIGTERM, waits 5s, then proceeds with SIGKILL to finish k3s's process cleanup during shutdown.
The sleep can be tweaked to suit your service's needs (something matching terminationGracePeriod, perhaps).
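The TERM-then-KILL sequence in the unit above can be exercised outside systemd; this sketch substitutes a scratch pid file for the cgroup.procs file and a background sleep for the leftover container processes:

```shell
#!/bin/sh
# Stand-in for /sys/fs/cgroup/unified/system.slice/%i/cgroup.procs
procs=$(mktemp)

sleep 300 &            # stand-in for a leftover container process
victim=$!
echo "$victim" > "$procs"

# Round 1: polite SIGTERM, exactly as the first ExecStart does
pids=$(cat "$procs"); echo $pids | xargs -r kill

sleep 1                # shortened stand-in for the unit's "sleep 5"

# Round 2: SIGKILL anything that ignored the SIGTERM
if kill -0 "$victim" 2>/dev/null; then
  pids=$(cat "$procs"); echo $pids | xargs -r kill -9
fi

wait "$victim" 2>/dev/null
if kill -0 "$victim" 2>/dev/null; then result="still alive"; else result="terminated"; fi
echo "$result"
rm -f "$procs"
```

Since sleep does not trap SIGTERM, the first round suffices here and the script prints "terminated"; a process that ignores SIGTERM would instead be caught by the second round.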
Hope it helps.
Awesome research!
@jraby your solution helped me resolve the issue; however, I ended up using k3s-killall.sh, per the k3s docs. With this there is no shutdown delay on my system.
Caution - this may not be what you want
The killall script cleans up containers, K3s directories, and networking components while also removing the iptables chain with all the associated rules. The cluster data will not be deleted.
I'm using this /etc/systemd/system/killall-on-shutdown@.service (any template unit name will do):
# source https://github.com/k3s-io/k3s/issues/2400#issuecomment-1013798094
# $ sudo systemctl enable killall-on-shutdown@k3s.service
[Unit]
Description=Kill cgroup procs on shutdown for %i
DefaultDependencies=false
Before=shutdown.target umount.target
[Service]
# Instanced units are not part of system.slice for some reason
# without this, the service isn't started at shutdown
Slice=system.slice
ExecStart=/bin/bash -c "/usr/local/bin/k3s-killall.sh"
Type=oneshot
[Install]
WantedBy=shutdown.target
This is on
Linux Mint 20.2 5.4.0-91-generic
The same problem exists on RKE2 (no surprise, given its roots are in k3s).
Yes, this is by design. Stopping the K3s (or RKE2) service does not stop running containers. This is to allow for nondisruptive upgrades of the main K3s/RKE2 components by simply replacing the binary and restarting the service.
would you accept a feature request to add a systemd unit like https://github.com/k3s-io/k3s/issues/2400#issuecomment-1018472343 which only triggers on shutdown? This would both allow the intended behaviour of k3s/rke2 (seamless updates/restarts) and allow for a shutdown/reboot that's even quicker than RKE1.
here's my non-instanced version of that (for rke2):
[Unit]
Description=Kill containerd-shims on shutdown
DefaultDependencies=false
Before=shutdown.target umount.target
[Service]
ExecStart=/bin/bash -c "/usr/local/bin/rke2-killall.sh"
Type=oneshot
[Install]
WantedBy=shutdown.target
That might be a good thing to add to the documentation, for folks that want it?
Confirming this behaviour to be present with:
root@k3s:~# k3s --version
k3s version v1.23.3+k3s1 (5fb370e5)
go version go1.17.5
root@k3s:~# uname -a
Linux k3s 5.4.0-100-generic #113-Ubuntu SMP Thu Feb 3 18:43:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
@ciacon this is not version-specific behavior. As described at https://github.com/k3s-io/k3s/issues/2400#issuecomment-1040795816 by design, pods are not stopped when the k3s process exits.
In releases prior to 1.23.7 it was enough to add KillMode=mixed to /etc/systemd/system/k3s.service; when the system shut down, k3s killed the containers and the computer turned off immediately.
For some reason unknown to me, since 1.23.8 and up to the current 1.24.4, doing so again takes 90s to shut down a system with k3s (which is the default systemd timeout, TimeoutStopUSec=1min 30s), so KillMode=mixed is ignored and k3s waits until the timeout has passed to kill them.
What has changed?
[Service]
Type=notify
EnvironmentFile=-/etc/default/%N
EnvironmentFile=-/etc/sysconfig/%N
EnvironmentFile=-/etc/systemd/system/k3s.service.env
KillMode=mixed
Probably something related to the containerd version change? I'm not sure, since changing the KillMode isn't something we test or support. I would recommend adding another unit that runs on shutdown, as described above.
Thanks I have implemented shutdown unit as described by @horihel several days ago and so far it works great.
May I vote for adding this to the official documentation, @brandond? I believe it is a pretty common scenario: k3s is ideal for edge deployments, and edge devices see many more shutdowns than servers usually do.
Here is a k3s version of https://github.com/k3s-io/k3s/issues/2400#issuecomment-1041165341:
[Unit]
Description=Kill containerd-shims on shutdown
DefaultDependencies=false
Before=shutdown.target umount.target
[Service]
ExecStart=/usr/local/bin/k3s-killall.sh
Type=oneshot
[Install]
WantedBy=shutdown.target
Put the file to /etc/systemd/system/shutdown-k3s.service and then enable the service using
systemctl enable shutdown-k3s.service
Also note that this service name, shutdown-k3s, must not start with k3s-; otherwise the k3s-killall.sh script would try to stop it and cause problems.
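The clash is just shell-glob prefix matching; a hypothetical illustration (the k3s-* pattern stands in for whatever matching the killall script does against unit names):

```shell
#!/bin/sh
# A unit named shutdown-k3s escapes a k3s-* prefix match;
# one named k3s-agent (or k3s-shutdown) would not.
match() {
  case "$1" in
    k3s-*) echo "$1: matched, would be stopped" ;;
    *)     echo "$1: not matched, left alone" ;;
  esac
}
match k3s-agent        # → k3s-agent: matched, would be stopped
match shutdown-k3s     # → shutdown-k3s: not matched, left alone
```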
> (quoting @jraby's cgroup-kill template unit from above)
This will only work where the unified hierarchy is mounted at /sys/fs/cgroup/unified, though; for example, I don't have /sys/fs/cgroup/unified/system.slice/ to begin with. :(
$ mount | grep group
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime)
none on /run/cilium/cgroupv2 type cgroup2 (rw,relatime)
Ubuntu 22.04
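One quick way to tell which layout a node has (a standard stat check, not something from this thread): the filesystem type of /sys/fs/cgroup is cgroup2fs on a pure unified setup and tmpfs on hybrid/legacy ones, where the unified tree, if present, sits at /sys/fs/cgroup/unified.

```shell
#!/bin/sh
# cgroup2fs -> pure cgroup v2 (e.g. Ubuntu 21.10+): procs directly under /sys/fs/cgroup/...
# tmpfs     -> hybrid/legacy: look under /sys/fs/cgroup/unified instead, if mounted
fstype=$(stat -fc %T /sys/fs/cgroup)
echo "$fstype"
```

On a pure-v2 host like the one above, the hardcoded /sys/fs/cgroup/unified/... path in the unit needs to be adjusted accordingly.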
I had trouble getting a shutdown service to behave, but it turns out that was because I changed the [Install] section of the service, and a systemctl daemon-reload is not enough to apply that change. You actually need to disable and re-enable the service (or run systemctl reenable) to get systemd to update the symlinks to the new target.
Yes, good catch. You will need to adapt the example for agent nodes. The server and agent use different service names.
Before needs to change to k3s-agent.service on agent nodes.
Unless of course one uses k3s ansible role which names them both as k3s.service. :)
> (quoting the shutdown-k3s.service unit and enablement instructions from above)
Can confirm this also works if you get the message A stop job is running for libcontainer...
Make sure to drain the node before shutdown, otherwise there will be data loss.
If you use the k3s ansible role you need to extract k3s-killall.sh from https://github.com/k3s-io/k3s/blob/d9f40d4f5b4776164322035499fabedea77f5f52/install.sh#L666-L743
Converting this issue into a discussion as this behavior is by design.