Error: OCI runtime error: runsc: creating container: systemd error: Interactive authentication required.
Description
When I try to use runsc with rootless podman, I get an error:
$ podman run --interactive --tty --rm --runtime=runsc debian:testing
Error: OCI runtime error: runsc: creating container: systemd error: Interactive authentication required.
With runc, I don't get the error:
$ podman run --interactive --tty --rm --runtime=runc debian:testing
root@bbdf83f8f377:/#
I'm not sure if this is a bug in gvisor, podman, my setup, or something else. This seemed like a good place to start though since I couldn't find any reports of the same issue here or in podman.
Steps to reproduce
Run the command above as a non-root user.
runsc version
$ runsc -version
runsc version 0.0~20240729.0
spec: 1.2.0
docker version (if using docker)
It's not docker, but seems relevant:
$ podman --version
podman version 5.4.0
uname
Linux solaria 6.12.12-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.12.12-1 (2025-02-02) x86_64 GNU/Linux
kubectl (if using Kubernetes)
repo state (if built from source)
No response
runsc debug logs (if available)
Here's the bottom of the create log. Let me know if more would be useful.
D0311 19:53:25.557605 2642085 container.go:200] Create container, cid: b157a14c7618504c34a0a4439d2379a8cc079c3953a5dabb8a9512fd81ca5e8b, rootDir: "/run/user/1000/runsc"
D0311 19:53:25.557658 2642085 container.go:1797] Configuring container with a new userns with identity user mappings into current userns
D0311 19:53:25.557717 2642085 container.go:1853] UID Mappings:
D0311 19:53:25.557726 2642085 container.go:1855] Container ID: 0, Host ID: 0, Range Length: 1
D0311 19:53:25.557732 2642085 container.go:1855] Container ID: 1, Host ID: 1, Range Length: 65536
D0311 19:53:25.557769 2642085 container.go:1853] GID Mappings:
D0311 19:53:25.557777 2642085 container.go:1855] Container ID: 0, Host ID: 0, Range Length: 1
D0311 19:53:25.557781 2642085 container.go:1855] Container ID: 1, Host ID: 1, Range Length: 65536
D0311 19:53:25.557816 2642085 container.go:262] Creating new sandbox for container, cid: b157a14c7618504c34a0a4439d2379a8cc079c3953a5dabb8a9512fd81ca5e8b
D0311 19:53:25.560717 2642085 cgroup.go:428] New cgroup for pid: self, *cgroup.cgroupSystemd: &{cgroupV2:{Mountpoint:/sys/fs/cgroup Path:/user.slice/libpod-b157a14c7618504c34a0a4439d2379a8cc079c3953a5dabb8a9512fd81ca5e8b.scope Controllers:[cpuset cpu io memory hugetlb pids rdma misc] Own:[]} Name:b157a14c7618504c34a0a4439d2379a8cc079c3953a5dabb8a9512fd81ca5e8b Parent:user.slice ScopePrefix:libpod properties:[] dbusConn:0xc000424280}
D0311 19:53:25.560739 2642085 systemd.go:98] Installing systemd cgroup resource controller under user.slice
D0311 19:53:25.560936 2642085 container.go:1036] Created filestore file at "/home/dseomn/.local/share/containers/storage/overlay/9cf5032d7db9dc90e050a258377367773f27dc3b742414a5dd3afbba14876693/merged/.gvisor.filestore.b157a14c7618504c34a0a4439d2379a8cc079c3953a5dabb8a9512fd81ca5e8b" for mount source "/home/dseomn/.local/share/containers/storage/overlay/9cf5032d7db9dc90e050a258377367773f27dc3b742414a5dd3afbba14876693/merged"
D0311 19:53:25.560962 2642085 systemd.go:154] Joining systemd cgroup libpod-b157a14c7618504c34a0a4439d2379a8cc079c3953a5dabb8a9512fd81ca5e8b.scope
D0311 19:53:25.568684 2642085 cgroup_v2.go:177] Deleting cgroup "/sys/fs/cgroup/user.slice/libpod-b157a14c7618504c34a0a4439d2379a8cc079c3953a5dabb8a9512fd81ca5e8b.scope"
D0311 19:53:25.568720 2642085 container.go:790] Destroy container, cid: b157a14c7618504c34a0a4439d2379a8cc079c3953a5dabb8a9512fd81ca5e8b
W0311 19:53:25.568855 2642085 util.go:64] FATAL ERROR: creating container: systemd error: Interactive authentication required.
W0311 19:53:25.568892 2642085 main.go:231] Failure to execute command, err: 1
I think this is a duplicate of https://github.com/google/gvisor/issues/311
Running as root should work here:
sudo podman run --interactive --tty --rm --runtime=runsc debian:testing
I saw that, but I thought that was supposed to be fixed as of 8e4cb261486ad84bc5657b1cee0288018f693d01, making this a regression?
Not really. The command you shared requests systemd as the cgroup manager.
The gVisor test script ignores cgroups at https://github.com/google/gvisor/blob/906fb319cc3afdd7ee8f6917a3a0636bcf7d1afd/test/podman/run.sh#L34
As an alternative to running as root, you can follow that test script and run rootlessly by doing:
$ mkdir /tmp/podman && cd "$_"
$ cat > runsc.podman <<EOF
#!/bin/bash
exec /tmp/runsc/runsc --ignore-cgroups "\$@"
EOF
$ chmod u+x runsc.podman
$ podman run --interactive --tty --rm --runtime /tmp/podman/runsc.podman debian:testing
root@c00655b0d8c6:/# dmesg
[ 0.000000] Starting gVisor...
[ 0.538988] Checking naughty and nice process list...
[ 0.549468] Creating cloned children...
[ 0.597381] Moving files to filing cabinet...
[ 0.778743] Constructing home...
[ 0.926395] Feeding the init monster...
[ 1.188440] Conjuring /dev/null black hole...
[ 1.597120] Digging up root...
[ 1.907546] Generating random numbers by fair dice roll...
[ 2.300496] Consulting tar man page...
[ 2.795365] Reading process obituaries...
[ 3.092411] Setting up VFS...
[ 3.351798] Setting up FUSE...
[ 3.490462] Ready!
Thank you, that fixes that error! Should --ignore-cgroups be the default for rootless containers, so that it just works out of the box?
Btw, there's an (I think) easier way to pass flags: podman run --runtime=runsc --runtime-flag=ignore-cgroups
I did get another error after adding that flag:
$ podman run --interactive --tty --rm --runtime=runsc --runtime-flag=ignore-cgroups debian:testing
starting container: setting up network: creating interfaces from net namespace "/proc/3194499/ns/net": cannot run with network enabled in root network namespace
Error: `/usr/bin/runsc --ignore-cgroups start d184db99d0d16877a9cb7bc63bb924c151dca61ce523b48c78ae04eaf8798414` failed: exit status 128
Adding --runtime-flag=network=none "fixed" it though. Should I file a separate feature request to get networking working in rootless containers? Or is that not feasible?
$ podman run --interactive --tty --rm --runtime=runsc --runtime-flag=ignore-cgroups --runtime-flag=network=none debian:testing
root@ef4b4733bc02:/#
I just actually read that dmesg output and literally laughed out loud. Thank you for that, whoever thought to put that humor in the dmesg logs!
Should --ignore-cgroups be the default for rootless containers, so that it just works out of the box?
Rootless mode works for me without this flag, so there must be some difference between our systems that makes your user unable to set up cgroups without superuser privileges. I'm not sure what that would be (try running under strace to see which syscall specifically gets rejected).
Adding --runtime-flag=network=none "fixed" it though. Should I file a separate feature request to get networking working in rootless containers? Or is that not feasible?
gVisor's userspace network stack is known not to work in rootless containers; see #10359 for why. You can still use host network stack passthrough (--network=host), but this reduces the level of isolation gVisor gives you for network-related system calls.
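For reference, the flag combinations discussed so far can be assembled as in the sketch below. `rootless_runsc_cmd` is a hypothetical helper, and passing `network=host` through podman's `--runtime-flag` is an assumption that has not been tested in this thread (only `network=none` was shown to work above).

```shell
# Hypothetical helper: print a rootless podman command line for runsc.
# $1 is the runsc network mode ("none" or "host"), $2 is the image.
rootless_runsc_cmd() {
    network_mode=$1
    image=$2
    printf 'podman run --interactive --tty --rm --runtime=runsc'
    printf ' --runtime-flag=ignore-cgroups'
    printf ' --runtime-flag=network=%s %s\n' "$network_mode" "$image"
}

# Print the command instead of running it, so it can be reviewed first.
rootless_runsc_cmd host debian:testing
```

Printing rather than executing keeps the sketch safe to paste; the trade-off with `network=host` is the reduced network isolation described above.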
I just actually read that dmesg output and literally laughed out loud. Thank you for that, whoever thought to put that humor in the dmesg logs!
Feel free to send a PR to add more to the rotation :)
Rootless mode works for me without this flag, so there must be some difference between our systems that makes your user unable to set up cgroups without superuser privileges. I'm not sure what that would be (try running under strace to see which syscall specifically gets rejected).
If I'm reading the strace correctly, I think the issue was on the other side of the /run/dbus/system_bus_socket socket, not something that would show up in an strace of podman or runsc:
3388800 recvmsg(12, {msg_name={sa_family=AF_UNIX, sun_path="/run/dbus/system_bus_socket"}, msg_namelen=112 => 30, msg_iov=[{iov_base="$\0\0\0Interactive authentication r"..., iov_len=41}], msg_iovlen=1, msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 41
Some lines from sudo dbus-monitor --system that look interesting:
method call time=1741891164.116532 sender=:1.21680 -> destination=org.freedesktop.systemd1 serial=2 path=/org/freedesktop/systemd1; interface=org.freedesktop.systemd1.Manager; member=StartTransientUnit
string "libpod-903d0b959497f8ca5d1009fdd47d0b05dedf1c7c0332ddecccce1a4fdcec73bf.scope"
string "replace"
array [
struct {
string "Slice"
variant string "user.slice"
}
struct {
string "Description"
variant string "Secure container 903d0b959497f8ca5d1009fdd47d0b05dedf1c7c0332ddecccce1a4fdcec73bf"
}
struct {
string "PIDs"
variant array [
uint32 3392470
]
}
struct {
string "MemoryAccounting"
variant boolean true
}
struct {
string "CPUAccounting"
variant boolean true
}
struct {
string "TasksAccounting"
variant boolean true
}
struct {
string "IOAccounting"
variant boolean true
}
struct {
string "Delegate"
variant boolean true
}
struct {
string "DefaultDependencies"
variant boolean false
}
struct {
string "TasksMax"
variant uint64 2048
}
]
array [
]
...
method call time=1741891164.116687 sender=:1.21136 -> destination=org.freedesktop.PolicyKit1 serial=17062 path=/org/freedesktop/PolicyKit1/Authority; interface=org.freedesktop.PolicyKit1.Authority; member=CheckAuthorization
struct {
string "system-bus-name"
array [
dict entry(
string "name"
variant string ":1.21680"
)
]
}
string "org.freedesktop.systemd1.manage-units"
array [
dict entry(
string "unit"
string "libpod-903d0b959497f8ca5d1009fdd47d0b05dedf1c7c0332ddecccce1a4fdcec73bf.scope"
)
dict entry(
string "verb"
string "start"
)
dict entry(
string "polkit.message"
string "Authentication is required to start transient unit '$(unit)'."
)
dict entry(
string "polkit.gettext_domain"
string "systemd"
)
]
uint32 0
string ""
...
method return time=1741891164.122599 sender=:1.9 -> destination=:1.21136 serial=6287 reply_serial=17062
struct {
boolean false
boolean true
array [
dict entry(
string "polkit.gettext_domain"
string "systemd"
)
dict entry(
string "polkit.retains_authorization_after_challenge"
string "1"
)
dict entry(
string "polkit.message"
string "Authentication is required to start transient unit '$(unit)'."
)
dict entry(
string "unit"
string "libpod-903d0b959497f8ca5d1009fdd47d0b05dedf1c7c0332ddecccce1a4fdcec73bf.scope"
)
dict entry(
string "verb"
string "start"
)
]
}
error time=1741891164.122671 sender=:1.21136 -> destination=:1.21680 error_name=org.freedesktop.DBus.Error.InteractiveAuthorizationRequired reply_serial=2
string "Interactive authentication required."
I think /org/freedesktop/PolicyKit1/Authority is the thing that actually denied the permission? From https://www.freedesktop.org/software/polkit/docs/latest/eggdbus-interface-org.freedesktop.PolicyKit1.Authority.html#eggdbus-method-org.freedesktop.PolicyKit1.Authority.CheckAuthorization I think the first two boolean ... lines in the method return ... reply_serial=17062 section are is_authorized and is_challenge.
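To make that reading concrete, here is a small sketch that classifies the first two booleans of a CheckAuthorization reply. `polkit_verdict` is a hypothetical helper written for this comment, not part of polkit or systemd. It assumes the two booleans are is_authorized and is_challenge, in that order, as the linked documentation suggests.

```shell
# Hypothetical helper: interpret the first two booleans of a polkit
# CheckAuthorization reply. Arguments: is_authorized is_challenge,
# each given as "true" or "false".
polkit_verdict() {
    is_authorized=$1
    is_challenge=$2
    if [ "$is_authorized" = "true" ]; then
        echo "authorized"
    elif [ "$is_challenge" = "true" ]; then
        # Not authorized now, but could be after authenticating.
        echo "challenge"
    else
        echo "denied"
    fi
}

# Values from the method return above: boolean false, boolean true.
polkit_verdict false true
```

With the values from the dbus-monitor output this prints `challenge`, which is consistent with the InteractiveAuthorizationRequired error that follows in the trace.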
So I'm guessing that there's some difference in /usr/share/polkit-1/rules.d/ between your system where it works and mine where it doesn't. I'm on Debian testing, what about you?
gVisor's userspace network stack is known not to work in rootless containers; see https://github.com/google/gvisor/issues/10359 for why. You can still use host network stack passthrough (--network=host), but this reduces the level of isolation gVisor gives you for network-related system calls.
Thanks for the pointer!
Does this command show any matching rules on your system?
$ LC_ALL=C.UTF-8 sudo grep -r 'org\.freedesktop\.systemd1\.' /etc/polkit-1/rules.d /run/polkit-1/rules.d /usr/local/share/polkit-1/rules.d /usr/share/polkit-1/rules.d
grep: /run/polkit-1/rules.d: No such file or directory
grep: /usr/local/share/polkit-1/rules.d: No such file or directory
I'm on Debian testing, what about you?
Corporate Google workstation, based on Debian testing, but I would not be surprised if there were many invasive tweaks to polkit.
$ LC_ALL=C.UTF-8 sudo grep -r 'org\.freedesktop\.systemd1\.' /etc/polkit-1/rules.d /run/polkit-1/rules.d /usr/local/share/polkit-1/rules.d /usr/share/polkit-1/rules.d
grep: /run/polkit-1/rules.d: No such file or directory
grep: /usr/local/share/polkit-1/rules.d: No such file or directory
I see no attempt to use dbus to talk to anything (strace -ff runsc --rootless do echo hi 2>&1 | grep -i dbus returns empty).
This suggests that runsc gives up on cgroup setup if it gets EACCES:
$ strace -ff runsc --debug=true --alsologtostderr --rootless do echo hi 2>&1 | grep -i cgroup
[pid 2213169] write(2, "D0313 17:18:15.870963 2213169 c"..., 93D0313 17:18:15.870963 2213169 config.go:456] Config.IgnoreCgroups (--ignore-cgroups): false
[pid 2213169] write(2, "D0313 17:18:15.870988 2213169 c"..., 93D0313 17:18:15.870988 2213169 config.go:456] Config.SystemdCgroup (--systemd-cgroup): false
[pid 2213175] write(2, "D0313 17:18:15.921426 2213175 c"..., 93D0313 17:18:15.921426 2213175 config.go:456] Config.IgnoreCgroups (--ignore-cgroups): false
[pid 2213175] write(2, "D0313 17:18:15.921447 2213175 c"..., 93D0313 17:18:15.921447 2213175 config.go:456] Config.SystemdCgroup (--systemd-cgroup): false
"MEMORY_PRESSURE_WATCH=/sys/fs/cgroup/user.slice/user-210638.slice/user@210638.service/session.slice/dbus.service/memory.pressure",
[pid 2213175] statfs("/sys/fs/cgroup", {f_type=CGROUP2_SUPER_MAGIC, f_bsize=4096, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={val=[0xaac7401f, 0x894c16af]}, f_namelen=255, f_frsize=4096, f_flags=ST_VALID|ST_NOSUID|ST_NODEV|ST_NOEXEC|ST_RELATIME}) = 0
[pid 2213175] openat(AT_FDCWD, "/sys/fs/cgroup/cgroup.controllers", O_RDONLY|O_CLOEXEC) = 11
[pid 2213175] write(2, "D0313 17:18:15.923539 2213175 c"..., 203D0313 17:18:15.923539 2213175 cgroup.go:428] New cgroup for pid: self, *cgroup.cgroupV2: &{Mountpoint:/sys/fs/cgroup Path:/runsc-060597 Controllers:[cpuset cpu io memory hugetlb pids rdma misc] Own:[]}
[pid 2213175] write(2, "D0313 17:18:15.923576 2213175 c"..., 102D0313 17:18:15.923576 2213175 cgroup_v2.go:132] Installing cgroup path "/sys/fs/cgroup/runsc-060597"
[pid 2213175] openat(AT_FDCWD, "/sys/fs/cgroup/cgroup.subtree_control", O_WRONLY|O_TRUNC|O_CLOEXEC) = -1 EACCES (Permission denied)
[pid 2213175] write(2, "D0313 17:18:15.923622 2213175 c"..., 95D0313 17:18:15.923622 2213175 cgroup_v2.go:177] Deleting cgroup "/sys/fs/cgroup/runsc-060597"
[pid 2213175] write(2, "W0313 17:18:15.923667 2213175 c"..., 160W0313 17:18:15.923667 2213175 container.go:1770] Skipping cgroup configuration in rootless mode: open /sys/fs/cgroup/cgroup.subtree_control: permission denied
[...]
This is coming from here: https://github.com/google/gvisor/blob/b01944883bfc3c0a0fa56565c197b3612401f9bc/runsc/container/container.go#L1824-L1834
So I think this code needs to broaden the way it identifies errors, such that polkit-based failures are treated in a similar manner, effectively turning off cgroup configuration in this case.
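A rough sketch of the broadened check, written in shell only to match the rest of this thread; the real logic lives in runsc's Go code and is not reproduced here. `should_skip_cgroups` is a hypothetical name, and matching on message text is an assumption made for illustration rather than how runsc classifies errors.

```shell
# Hypothetical decision: in rootless mode, treat both the plain
# permission error and the polkit/systemd refusal as "skip cgroups".
# $1 is "true" when running rootless, $2 is the error message.
should_skip_cgroups() {
    rootless=$1
    err=$2
    [ "$rootless" = "true" ] || return 1
    case "$err" in
        *"permission denied"*) return 0 ;;
        *"Interactive authentication required"*) return 0 ;;
        *) return 1 ;;
    esac
}

# The systemd error from this issue's log would be skipped, not fatal.
if should_skip_cgroups true "systemd error: Interactive authentication required."; then
    echo "Skipping cgroup configuration in rootless mode"
fi
```

In non-rootless mode the function always returns failure, so the same errors would still be fatal there.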
I see no attempt to use dbus to talk to anything (strace -ff runsc --rootless do echo hi 2>&1 | grep -i dbus returns empty).
I don't think this is relevant, but the strace line I shared before was from podman (following forks), I think.
This suggests that runsc gives up on cgroup setup if it gets EACCES:
I see Config.SystemdCgroup (--systemd-cgroup): false in your logs, but in mine:
D0311 19:53:25.555206 2642085 config.go:436] Config.SystemdCgroup (--systemd-cgroup): true
I'm not sure about this, but I think it was systemd, not runsc, sending the dbus messages to polkit.
So I think this code needs to broaden the way it identifies errors, such that polkit-based failures are treated in a similar manner, effectively turning off cgroup configuration in this case.
If my understanding above is right, then I think runsc would see it as a failure from systemd, not polkit, but I'm not sure. Either way, it would be nice if it worked by default.
In case it's relevant to how podman calls runsc, here's part of the output of podman info on my system:
host:
...
cgroupControllers:
- cpu
- memory
- pids
cgroupManager: systemd
cgroupVersion: v2
I have the same issue. podman only works with --runtime-flag=ignore-cgroups unless running as root.
Let me know if you'd like any info from my system.
“Interactive authentication required” is a red herring. The actual problem is that runsc is talking to the system service manager when it should be talking to the user service manager. Unprivileged users should not have the ability to create new system-wide units, but they can (and do) have the ability to create per-user units!
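The distinction is easy to see from a shell, since systemd-run and systemctl take --user for the caller's own service manager and --system for the system-wide one. The sketch below uses a hypothetical helper, `systemd_manager_flag`, to pick between them by UID. It illustrates the idea only and says nothing about how runsc should select its D-Bus connection.

```shell
# Hypothetical helper: choose which service manager to address.
# $1 is a numeric UID; root uses the system manager, others the user one.
systemd_manager_flag() {
    if [ "$1" -eq 0 ]; then
        echo "--system"
    else
        echo "--user"
    fi
}

# Show the flag that applies to the current user.
systemd_manager_flag "$(id -u)"

# Example use (assumes a running systemd user session; not executed here):
#   systemd-run "$(systemd_manager_flag "$(id -u)")" --scope --quiet true
```

If the commented systemd-run line succeeds for an unprivileged user with --user, that user can create transient scopes in their own manager without any polkit prompt.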