kind
Running with rootless podman doesn't work as documented
What happened: I tried running kind with rootless podman and followed the documentation. But this didn't work
What you expected to happen: That it would work
How to reproduce it (as minimally and precisely as possible): Install and configure podman for rootless, install kind. Use a terminal with systemd-scopes, like gnome-terminal. Use an OS that doesn't Delegate everything, like Arch Linux. (Seems to be done on Fedora, https://gitlab.gnome.org/GNOME/gnome-terminal/-/issues/7914#note_1523590)
Anything else we need to know?: @benzea stated in this ticket, that tools that depend on cgroups like kind does, should wrap themselves in either a unit or switch to a different scope themselves. (https://gitlab.gnome.org/GNOME/gnome-terminal/-/issues/7914#note_1523646)
Environment:
- kind version: (use kind version): 0.14.0
- Kubernetes version: (use kubectl version): 1.24.3
- Docker version: (use docker info): command not found 😉
- Podman version: (use podman version): 4.1.1
- OS (e.g. from /etc/os-release): Arch Linux
Docker version: (use docker info): command not found 😉
This is not helpful … podman attempts to be docker compatible and contains docker sub commands like podman info. This command is rich with host environment debug info.
I wonder if the kind authors just never got the message that units are now put into app.slice by default, and delegation also needs to be enabled in intermediate slices. And if they are e.g. on Fedora, that is done by default ...
rootless podman support is contributed by @AkihiroSuda, nominally kind is developed for rootful docker.
Akihiro contributed some CI for rootless docker and podman, which does happen to be Fedora based.
This is the first we’ve encountered a rootless podman user that wasn’t on fedora, as you can tell. Most of our users use docker, which has mature support. Podman support is experimental (the tool should be printing a warning when you use podman) and is fundamentally limited by some of the significant differences where it is not drop in compatible, we have entirely separate code paths for podman behavior.
No idea what exactly your problem is. But if something wants to use cgroups, it really should run in its own systemd unit and enable delegation by setting the appropriate options there. As you know, that is possible to do by writing the .service file or using systemd-run --user.
The kind process doesn’t touch cgroups. Our tool invokes docker or podman to spawn a container, inside the container we do touch cgroups.
Host cgroups configuration is not something we currently plan to touch from the kind process, for example podman we’re invoking may actually be talking to a remote instance anyhow and it’s difficult to detect reliably (see discussions linked to #2233)
Docker version: (use docker info): command not found 😉
This is not helpful … podman attempts to be docker compatible and contains docker sub commands like podman info. This command is rich with host environment debug info.
I thought you only needed the version, podman info:
host:
  arch: amd64
  buildahVersion: 1.26.1
  cgroupControllers:
  - io
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: /usr/bin/conmon is owned by conmon 1:2.1.3-1
    path: /usr/bin/conmon
    version: 'conmon version 2.1.3, commit: ab52a597278b20173440140cd810dc9fa8785c93'
  cpuUtilization:
    idlePercent: 68.52
    systemPercent: 11.05
    userPercent: 20.44
  cpus: 16
  distribution:
    distribution: arch
    version: unknown
  eventLogger: journald
  hostname: steve
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 10000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 10000
      size: 65536
  kernel: 5.18.16-zen1-1-zen
  linkmode: dynamic
  logDriver: journald
  memFree: 3378966528
  memTotal: 33353601024
  networkBackend: netavark
  ociRuntime:
    name: crun
    package: /usr/bin/crun is owned by crun 1.5-1
    path: /usr/bin/crun
    version: |-
      crun version 1.5
      commit: 54ebb8ca8bf7e6ddae2eb919f5b82d1d96863dea
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  remoteSocket:
    exists: true
    path: /run/user/1000/podman/podman.sock
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /etc/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: /usr/bin/slirp4netns
    package: /usr/bin/slirp4netns is owned by slirp4netns 1.2.0-1
    version: |-
      slirp4netns version 1.2.0
      commit: 656041d45cfca7a4176f6b7eed9e4fe6c11e8383
      libslirp: 4.7.0
      SLIRP_CONFIG_VERSION_MAX: 4
      libseccomp: 2.5.4
  swapFree: 0
  swapTotal: 0
  uptime: 23h 32m 1.98s (Approximately 0.96 days)
plugins:
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  volume:
  - local
registries:
  search:
  - docker.io
  - hub.4allportal.net
store:
  configFile: /home/cwr/.config/containers/storage.conf
  containerStore:
    number: 0
    paused: 0
    running: 0
    stopped: 0
  graphDriverName: btrfs
  graphOptions: {}
  graphRoot: /home/cwr/.local/share/containers/storage
  graphRootAllocated: 1023691194368
  graphRootUsed: 481339727872
  graphStatus:
    Build Version: Btrfs v5.18.1
    Library Version: "102"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 0
  runRoot: /run/user/1000/containers
  volumePath: /home/cwr/.local/share/containers/storage/volumes
version:
  APIVersion: 4.1.1
  Built: 1659559968
  BuiltTime: Wed Aug 3 22:52:48 2022
  GitCommit: f73d8f8875c2be7cd2049094c29aff90b1150241-dirty
  GoVersion: go1.19
  Os: linux
  OsArch: linux/amd64
  Version: 4.1.1
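The cgroupControllers list in this output contains io, memory and pids, but no cpu or cpuset. A minimal sketch of how one might compare the delegated controllers against a wanted set follows. The function name missing_controllers and the wanted set are assumptions for illustration, not something kind or podman ships.

```shell
# Hypothetical helper: print every wanted controller that is absent from
# the space-separated list of available controllers given as $1.
missing_controllers() {
    have=" $1 "   # pad with spaces so whole-word matching works
    shift
    for want in "$@"; do
        case "$have" in
            *" $want "*) ;;          # controller is delegated
            *) echo "$want" ;;       # controller is missing
        esac
    done
}
```

On a cgroup v2 host with a typical systemd layout (an assumption), the first argument could come from the user manager's cgroup.controllers file, for example: missing_controllers "$(cat /sys/fs/cgroup/user.slice/user-$(id -u).slice/user@$(id -u).service/cgroup.controllers)" cpu cpuset io memory pids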
I wonder if the kind authors just never got the message that units are now put into app.slice by default, and delegation also needs to be enabled in intermediate slices. And if they are e.g. on Fedora, that is done by default ...
rootless podman support is contributed by @AkihiroSuda, nominally kind is developed for rootful docker.
Akihiro contributed some CI for rootless docker and podman, which does happen to be Fedora based.
This is the first we’ve encountered a rootless podman user that wasn’t on fedora, as you can tell. Most of our users use docker, which has mature support. Podman support is experimental (the tool should be printing a warning when you use podman) and is fundamentally limited by some of the significant differences where it is not drop in compatible, we have entirely separate code paths for podman behavior.
Huh, interesting, I thought podman would be more widespread 😅 I've been using it for years now and only on Arch Linux
No idea what exactly your problem is. But if something wants to use cgroups, it really should run in its own systemd unit and enable delegation by setting the appropriate options there. As you know, that is possible to do by writing the .service file or using systemd-run --user.
The kind process doesn’t touch cgroups. Our tool invokes docker or podman to spawn a container, inside the container we do touch cgroups.
Mh, then I guess I'll have to continue to run kind in its own scope.
This is the first we’ve encountered a rootless podman user that wasn’t on fedora, as you can tell. Most of our users use docker, which has mature support. Podman support is experimental (the tool should be printing a warning when you use podman) and is fundamentally limited by some of the significant differences where it is not drop in compatible, we have entirely separate code paths for podman behavior.
I am also using kind on Arch/Garuda, and we have encountered each other before @BenTheElder. In fact, I remember raising an issue about this, and giving up; #2684. Making this issue a duplicate of mine. @maciekmm is also on Arch and uses rootless, and he just made an issue a few days ago.
Podman is FOSS, and in the spirit of kind also being FOSS, it is my humble opinion that podman should have a little more attention. I completely understand that it requires a lot of time, and have endless respect for the work the kind team has done; this is my code of ethics, not yours (no harm if you don't want to do it). kind has been very helpful to me, again, thank you.
Please make kind distro and podman/docker agnostic. There aren't that many: Arch, Fedora, RedHat, and Debian (off the top of my head).
This is not helpful … podman attempts to be docker compatible and contains docker sub commands like podman info. This command is rich with host environment debug info.
In fairness, I think he was trying to be funny.
- Does it work with KDE or XFCE?
- Does it work with Rootless Docker?
- Can be workaround-ed with systemd-run ? If so, could you open a PR to update the docs?
In fairness, I think he was trying to be funny.
Actually, I interpreted Docker version: literally and thought docker info was just a suggestion on how to get the version.
Even if I had docker I would have still just inserted the version there 😅
Does it work with KDE or XFCE?
What do you mean by that? I would say it's not coupled to anything DE related.
Does it work with Rootless Docker?
I will try this tomorrow 👍
Can be workaround-ed with systemd-run ? If so, could you open a PR to update the docs?
Yeah, you can wrap the kind call in systemd-run;
systemd-run --user --scope --property=Delegate=yes kind create cluster
If that's the recommended way to run kind with rootless podman, then yes, I can open a PR
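For repeated use, the systemd-run invocation above can be wrapped in a small shell function. This is only a sketch: the function name kind_scoped and the KIND_DRY_RUN variable are hypothetical names chosen for illustration, and kind itself does not read them.

```shell
# Hypothetical wrapper: run any kind subcommand inside its own user scope
# with cgroup delegation enabled, using the systemd-run flags quoted above.
kind_scoped() {
    if [ "${KIND_DRY_RUN:-0}" = "1" ]; then
        # Dry run: only print the command that would be executed.
        echo systemd-run --user --scope --property=Delegate=yes kind "$@"
    else
        systemd-run --user --scope --property=Delegate=yes kind "$@"
    fi
}
```

With this in a shell profile, kind_scoped create cluster and kind_scoped delete cluster both run in a delegated scope without retyping the flags.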
Actually, I interpreted Docker version: literally and thought docker info was just a suggestion on how to get the version.
We should clarify the template, this command provides necessary debug info.
Podman is FOSS, and in the spirit of kind also being FOSS, it is my humble opinion that podman should have a little more attention. I completely understand that it requires a lot of time, and have endless respect for the work the kind team has done; this is my code of ethics, not yours (no harm if you don't want to do it). kind has been very helpful to me, again, thank you.
Please make kind distro and podman/docker agnostic. There aren't that many: Arch, Fedora, RedHat, and Debian (off the top of my head).
This is not about ethics and podman receives a lot of attention.
There are not simply "Arch, Fedora, RedHat, and Debian", there are infinite linux distros and configurations, new ones are created every day. We cannot support all of them equally (and RHEL requires a license ...). We have limited time and resources to run them locally and in CI. So far that means podman and rootless are tested primarily in CI on Fedora, we already have a large CI matrix for this small project.
KIND is already supporting podman to a more than reasonable extent, at a relatively outsized cost. Unlike docker, podman does not provide a stable, mature interface. Both are FOSS.
It is mostly docker compatible except when it isn't, which is fine, we've already developed separate implementations to support podman and set up CI ... However podman also makes breaking changes against its own behavior.
Off the top of my head: https://github.com/kubernetes-sigs/kind/pull/2257, https://github.com/kubernetes-sigs/kind/issues/2085#issuecomment-784804927, ...
Docker has made exactly one small breaking change for the duration of this project (#2046) even though it has been supported for far longer.
We support podman anyhow, even though the primary purpose of this project is to develop Kubernetes (see: https://kind.sigs.k8s.io/docs/contributing/project-scope/) and Kubernetes requires docker, not podman to develop because it leverages buildx for multi-arch and all those users therefore must have docker installed.
There are additional limitations to using podman (mainly around restart support) because podman simply does not handle these things due to difference in approach.
Please remember that @aojea and I are already lending our free time or cutting into work time to support this and we could instead be improving / fixing Kubernetes (which relates to our actual current day jobs) or shipping a new KIND release at impact to far more users.
I have recently gone way out of my way to prevent rootless from being broken in particular (https://github.com/kubernetes-sigs/kind/pull/2846, https://github.com/kubernetes/enhancements/issues/361#issuecomment-1172435157 and the less visible work from myself and others meeting to find a ship and last minute fix to Kubernetes) because I do care about our users, but my time is bounded.
I will review PRs to fix this, but it's simply not a priority for me to debug rootless podman x Arch ... Kubernetes only works fully on rootful and docker is a perfectly acceptable free and open source alternative to podman, I don't personally use Arch and I cannot run it on my employer-provided machines.
Another consideration here: projects like Kubernetes, podman, docker, runc, containerd, etc. also only run CI or develop for a limited set of environments. We carry a higher cost to keep these things working together because the things we're integrating with are not developed or tested in these ways, so, as in https://github.com/kubernetes/enhancements/issues/361#issuecomment-1172435157, we have to turn around and proactively convince them to support these things and fix them to enable support in KIND.
I'll go a step further and say I'm willing to write docs changes or code patches to fix these environments myself if we receive sufficient information about how to fix them, but we're not going to stretch our CI matrix even further or locally develop on additional environments. It's already a lot.
Uh, but if podman is the one needing Delegation, then shouldn't podman have the corresponding configuration and documentation on how to get it up and running (which can then be directly linked by kind).
Uh, but if podman is the one needing Delegation, then shouldn't podman have the corresponding configuration and documentation on how to get it up and running (which can then be directly linked by kind).
The documentation is already there: https://kind.sigs.k8s.io/docs/user/rootless/ https://github.com/containers/podman/blob/main/docs/tutorials/rootless_tutorial.md
It's comprehensive, and I'd go as far as saying that these two combined form a complete guide on how to run kind on rootless podman/docker.
On top of that kind already links to that document if it detects missing Delegate.
https://github.com/kubernetes-sigs/kind/pull/2981 adds a hint for systemd-run --scope --user kind create cluster, thanks @VannTen
https://github.com/kubernetes-sigs/kind/pull/3032 will clarify the bug template re: docker info / podman info.
#2981 adds a hint for
systemd-run --scope --user kind create cluster, thanks @VannTen
Took me a while to find this issue, I followed the hint and did all the cgroups v2 checks along with adding the /etc/systemd/system/[email protected]/delegate.conf file... none of it worked, sadly.
On my system (Ubuntu 22.04 LTS + Podman 4.3.1) that hint doesn't work ~, whereas @cwrau's method does not error out, though I can't seem to find any created clusters~ (see update):
$ systemd-run --scope --user kind create cluster
Running scope as unit: run-r39bac831e38c4a4fad72f425230b9030.scope
enabling experimental podman provider
ERROR: failed to create cluster: running kind with rootless provider requires setting systemd property "Delegate=yes", see https://kind.sigs.k8s.io/docs/user/rootless/
$ systemd-run --user --property=Delegate=yes kind create cluster
Running as unit: run-r7137d1a0db4f46a5a1c6d6fbcf7225eb.service
$ kind get clusters
enabling experimental podman provider
No kind clusters found.
$ systemd-run --scope --user kind get clusters
Running scope as unit: run-r923df78bf2bc4b2bae955892be078d3c.scope
enabling experimental podman provider
No kind clusters found.
Is there a corresponding issue in podman's issues that links to this?
Update: Upon further inspection, @cwrau's invocation fails exactly the same way, except the error gets dumped in journalctl -f --user instead of stderr...
Jan 12 23:35:13 razer-neon systemd[841]: Started /usr/local/bin/kind create cluster.
Jan 12 23:35:13 razer-neon kind[9015]: enabling experimental podman provider
Jan 12 23:35:13 razer-neon kind[9015]: ERROR: failed to create cluster: running kind with rootless provider requires setting systemd property "Delegate=yes", see https://kind.sigs.k8s.io/docs/user/rootless/
Jan 12 23:35:13 razer-neon systemd[841]: run-r85f05a467a5b4292b8af42fbeda81917.service: Main process exited, code=exited, status=1/FAILURE
Jan 12 23:35:13 razer-neon systemd[841]: run-r85f05a467a5b4292b8af42fbeda81917.service: Failed with result 'exit-code'.
Update: Upon further inspection, @cwrau's invocation fails exactly the same way, except the error gets dumped in journalctl -f --user instead of stderr...
You can run it with systemd-run --user >--scope< --property=Delegate=yes kind create cluster to run it synchronously and with direct output, I adjusted my above comment.
I just wanted to try this again to check if it's working on my end, but I was getting different errors;
λ sru --property=Delegate=yes --scope kind create cluster
Running scope as unit: run-r8d8a8ba153014351b0e1c5199d4f5edc.scope
enabling experimental podman provider
Creating cluster "kind" ...
ERROR: failed to create cluster: failed to ensure podman network: command "podman network create -d=bridge --ipv6 --subnet fc00:f853:ccd:e793::/64 kind" failed with error: exit status 125
Command Output: Error: could not find free subnet from subnet pools
I fixed that by adding {"base" = "11.0.0.0/24", "size" = 24} as an additional subnet_pool in my containers.conf;
[network]
default_subnet_pools = [
{"base" = "11.0.0.0/24", "size" = 24},
{"base" = "10.89.0.0/16", "size" = 24},
{"base" = "10.90.0.0/15", "size" = 24},
{"base" = "10.92.0.0/14", "size" = 24},
{"base" = "10.96.0.0/11", "size" = 24},
{"base" = "10.128.0.0/9", "size" = 24},
]
Then I was getting the following error;
λ sru --property=Delegate=yes --scope kind create cluster
Running scope as unit: run-rdce61f37b2fe4c978281deff3e2c6697.scope
enabling experimental podman provider
Creating cluster "kind" ...
✓ Ensuring node image (kindest/node:v1.25.3) 🖼
✗ Preparing nodes 📦
ERROR: failed to create cluster: command "podman run --name kind-control-plane --hostname kind-control-plane --label io.x-k8s.kind.role=control-plane --privileged --tmpfs /tmp --tmpfs /run --volume bf13142d953a4c24f351bff1f96bbbd0e82381cc93edb6f49b475e8abc5da707:/var:suid,exec,dev --volume /lib/modules:/lib/modules:ro -e KIND_EXPERIMENTAL_CONTAINERD_SNAPSHOTTER --detach --tty --net kind --label io.x-k8s.kind.cluster=kind -e container=podman --volume /dev/mapper:/dev/mapper --device /dev/fuse --publish=127.0.0.1:35889:6443/tcp -e KUBECONFIG=/etc/kubernetes/admin.conf docker.io/kindest/node@sha256:f52781bc0d7a19fb6c405c2af83abfeb311f130707a0e219175677e366cc45d1" failed with error: exit status 126
Command Output: time="2023-01-13T12:03:15+01:00" level=warning msg="aardvark-dns binary not found, container dns will not be enabled"
Error: netavark: code: 3, msg: modprobe: ERROR: could not insert 'ip6_tables': Operation not permitted
ip6tables v1.8.8 (legacy): can't initialize ip6tables table `nat': Table does not exist (do you need to insmod?)
Perhaps ip6tables or your kernel needs to be upgraded.
Which I fixed by running sudo modprobe ip6_tables
After that it's working 😁
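Before creating a cluster, one could check whether ip6_tables is already loaded instead of waiting for netavark to fail. A minimal sketch, assuming the module is built as a loadable module (built-in modules do not appear in /proc/modules); the function name module_loaded is hypothetical.

```shell
# Hypothetical helper: return 0 if the named kernel module is listed in a
# /proc/modules-style file (first column is the module name).
module_loaded() {
    name=$1
    file=${2:-/proc/modules}   # the second argument lets you point at another file
    while read -r mod _rest; do
        [ "$mod" = "$name" ] && return 0
    done < "$file"
    return 1
}
```

Usage could look like: module_loaded ip6_tables || sudo modprobe ip6_tables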