kind
(Option to) create podman containers with --restart=always to restart cluster after reboots
What would you like to be added:
- Please create podman containers with --restart=always. Landing this behind an option would be fine, too.
At this point, podman containers making up kind clusters have to be re-started manually after reboot with something like podman start --all.
Podman installations typically come with systemd services that stop all containers on system shutdown, but on system boot they only re-start containers with restart policy always (verified for podman on Ubuntu). Since that policy can only be controlled on container creation, the workarounds are rather cumbersome.
This looks somewhat related to #148.
Why is this needed:
- Managing kind clusters deployed to standalone systems would be significantly streamlined. A rebooted system running a kind cluster (on podman) would be able to come right back up again without user interaction.
> Podman installations typically come with systemd services that stop all containers on system shutdown, but on system boot they only re-start containers with restart policy always (verified for podman on Ubuntu)
Can you send a link to this? Last time we checked, it was not working.
There are limitations, but generally it works as described in the podman documentation. The restart service is packaged (and enabled) in Ubuntu and supposedly any Debian-based distro, likely other distros as well.
Here's the relevant service excerpt on an Ubuntu mantic system as seen by # systemctl edit podman-restart.service:
[Service]
Type=oneshot
RemainAfterExit=true
Environment=LOGGING="--log-level=info"
ExecStart=/usr/bin/podman $LOGGING start --all --filter restart-policy=always
ExecStop=/bin/sh -c '/usr/bin/podman $LOGGING stop $(/usr/bin/podman container ls --filter restart-policy=always -q)'
Unfortunately, this only works for podman containers run by root, and currently stopping containers tends to run into timeouts on shutdown. Due to podman's design, even root doesn't see all user containers, so they won't get stopped or started by the systemd service which runs as root by default. It's not too difficult, though, to replicate it as a service running as kind user if running as root is not desired.
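A minimal sketch of what such a per-user replica could look like, assuming the same ExecStart/ExecStop commands as the system unit quoted above. The unit name kind-podman-restart.service and the helper function write_user_restart_unit are made up for illustration, and the unit only affects containers that already carry restart policy always, which kind does not currently set:

```shell
#!/bin/sh
# Write a user-level systemd unit mirroring the system-wide podman restart
# service. The unit name and this helper are hypothetical, not part of
# podman or kind.
write_user_restart_unit() {
    # Default to the per-user systemd unit directory unless a path is given.
    unit_dir=${1:-${XDG_CONFIG_HOME:-$HOME/.config}/systemd/user}
    mkdir -p "$unit_dir" || return 1
    cat > "$unit_dir/kind-podman-restart.service" <<'EOF'
[Unit]
Description=Start podman containers with restart policy always (user session)

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/usr/bin/podman start --all --filter restart-policy=always
ExecStop=/bin/sh -c '/usr/bin/podman stop $(/usr/bin/podman container ls --filter restart-policy=always -q)'

[Install]
WantedBy=default.target
EOF
}
```

After calling write_user_restart_unit as the user that owns the containers, the unit would still need to be enabled with systemctl --user daemon-reload and systemctl --user enable kind-podman-restart.service. For it to run at boot without an interactive login, lingering has to be enabled for that user (loginctl enable-linger).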
The always policy works poorly because a failed cluster startup will be retried indefinitely.
We don't use this policy with the docker runtime; there we set a one-time restart. See the podman tracking issues.
Restart is also going to be problematic without DNS for node names.
Points taken. I get reliable cluster reboots with a user crontab like
@reboot /usr/bin/podman start --all --filter 'restart-policy=always'
@reboot /usr/bin/podman start --all --filter 'name=kind-.*'
While I agree that it shouldn't be the default, if --restart=always was an option in kind, you could rely on the generic mechanism instead of requiring one crontab line per cluster.
If we can get restarts working reliably in podman, it will be the default. We don't want to add more knobs or, worse, container-runtime-specific knobs.
https://github.com/kubernetes-sigs/kind/issues/2272 is the existing tracking issue for podman.
a quick echo of the point @cr made. indeed for us a
podman start --all --filter 'name=<our-kind-cluster-name>'
does seem to restore kubernetes functionally after a podman machine restart. it would be nice if this could be generalized? while a kind create cluster -n foo results in a kubernetes context kind-foo, the podman container seems to be named just foo. this leaves us without a general breadcrumb on the container name that would tie it back to kind?
> the podman container seems to be named just foo. this leaves us without a general breadcrumb on the container name that would tie it back to kind?
changing the naming scheme of the containers would break a LOT of stuff (aside: it shouldn't be foo but $name-$node_role$count), but it's already possible to identify cluster containers with kind get nodes -n foo
something like kind get nodes -n foo | xargs podman start ?
thanks @BenTheElder! i think kind get nodes -A | xargs -n1 podman start might be the general-purpose trick!
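For anyone who wants that one-liner as a reusable script, here is a rough sketch. It assumes kind and podman are on PATH and that kind resolves to the podman provider (for example via KIND_EXPERIMENTAL_PROVIDER=podman); the function name start_kind_nodes is made up. It uses a read loop instead of xargs so that an empty node list does nothing and a failure on one node does not stop the rest:

```shell
#!/bin/sh
# Start every container that kind reports as a node, across all clusters.
# start_kind_nodes is a hypothetical helper name, not a kind subcommand.
start_kind_nodes() {
    # kind prints one node (container) name per line; its "no nodes found"
    # notice goes to stderr, which is discarded here.
    kind get nodes -A 2>/dev/null |
    while IFS= read -r node; do
        # Skip blank lines defensively.
        [ -n "$node" ] || continue
        # Redirect stdin so podman cannot consume the remaining node names.
        podman start "$node" </dev/null || echo "failed to start $node" >&2
    done
}
```

Calling start_kind_nodes from a user crontab @reboot entry or a user systemd unit would cover all clusters with a single line, at the cost of depending on kind being installed for that user.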