bubblewrap
bwrap refuses to run with ambient capabilities
When running in a rootless podman container launched with --userns=keep-id alone, bwrap refuses to run, with the following message:
bwrap: Unexpected capabilities but not setuid, old file caps config?
However, adding --user 1000:1000 (or any other uid:gid) makes bwrap work as expected.
This is an instance of a more general issue, that nesting containers (container inside container) doesn't always work.
What distribution are you using, inside and outside the podman container?
What command-line did you use to launch the podman container?
What bwrap command-line are you using to reproduce this bug? (The simpler the better.)
What version of bwrap are you using? Is your bwrap executable setuid root or not?
What capabilities and credentials do you have inside the podman rootless container? In recent distributions you can find out by running setpriv --dump.
This is an instance of a more general issue, that nesting containers (container inside container) doesn't always work.
Yeah, but what makes me curious is that, as far as I understand, --userns=keep-id and --userns=keep-id --user 1000:1000 should produce the same namespace, given that the host user's ID is 1000.
What distribution are you using, inside and outside the podman container?
Fedora 32 outside, freedesktop-sdk inside. Also tested with the Ubuntu 20.04 image, using bwrap from its repositories.
What command-line did you use to launch the podman container?
podman run --rm --tmpfs /tmp \
--security-opt label=type:spc_t \
--security-opt seccomp=flatpak-docker-seccomp.json \
-v /proc:/host/proc \
--userns=keep-id \
-it freedesktopsdk/flatpak:19.08-x86_64 \
Also reproducible with the --privileged option instead of the --security-opt options and the /proc mount, and with seccomp=unconfined.
What bwrap command-line are you using to reproduce this bug? (The simpler the better.)
/usr/libexec/flatpak-bwrap --unshare-all --ro-bind / / --proc /proc true
What version of bwrap are you using? Is your bwrap executable setuid root or not?
The one bundled with flatpak 1.4.3, or 0.4.0 in the Ubuntu image; neither is setuid.
What capabilities and credentials do you have inside the podman rootless container? In recent distributions you can find out by running setpriv --dump.
With --userns=keep-id alone:
uid: 1000
euid: 1000
gid: 1000
egid: 1000
Supplementary groups: [none]
no_new_privs: 0
Inheritable capabilities: chown,dac_override,fowner,fsetid,kill,setgid,setuid,setpcap,net_bind_service,net_raw,sys_chroot,mknod,audit_write,setfcap
Ambient capabilities: chown,dac_override,fowner,fsetid,kill,setgid,setuid,setpcap,net_bind_service,net_raw,sys_chroot,mknod,audit_write,setfcap
Capability bounding set: chown,dac_override,fowner,fsetid,kill,setgid,setuid,setpcap,net_bind_service,net_raw,sys_chroot,mknod,audit_write,setfcap
Securebits: [none]
Parent death signal: [none]
SELinux label: system_u:system_r:spc_t:s0:c581,c747
With --userns=keep-id --user 1000:1000:
uid: 1000
euid: 1000
gid: 1000
egid: 1000
Supplementary groups: [none]
no_new_privs: 0
Inheritable capabilities: [none]
Ambient capabilities: [none]
Capability bounding set: chown,dac_override,fowner,fsetid,kill,setgid,setuid,setpcap,net_bind_service,net_raw,sys_chroot,mknod,audit_write,setfcap
Securebits: [none]
Parent death signal: [none]
SELinux label: system_u:system_r:spc_t:s0:c611,c701
The difference appears to be that without --user 1000:1000, podman gives you inheritable and ambient capabilities.
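One way to confirm this from inside the container without setpriv is to read the CapAmb field of /proc/self/status and decode the hex mask. The following is a minimal sketch, not something taken from bwrap or podman: the helper name decode_caps and the CAP_NAMES list are made up for illustration, and the bit order is assumed to follow capabilities(7) and linux/capability.h.

```shell
#!/bin/sh
# Capability names in bit order (bit 0 first), assumed from linux/capability.h.
CAP_NAMES="chown dac_override dac_read_search fowner fsetid kill setgid setuid
setpcap linux_immutable net_bind_service net_broadcast net_admin net_raw
ipc_lock ipc_owner sys_module sys_rawio sys_chroot sys_ptrace sys_pacct
sys_admin sys_boot sys_nice sys_resource sys_time sys_tty_config mknod lease
audit_write audit_control setfcap mac_override mac_admin syslog wake_alarm
block_suspend audit_read perfmon bpf checkpoint_restore"

# decode_caps HEXMASK: print the comma-separated names of the bits set in
# HEXMASK, or "[none]" if the mask is zero. Runs in a subshell so that its
# variables do not leak into the caller.
decode_caps() (
    mask=$(( 0x$1 ))
    bit=0
    out=""
    for name in $CAP_NAMES; do
        if [ $(( (mask >> bit) & 1 )) -eq 1 ]; then
            out="${out:+$out,}$name"
        fi
        bit=$(( bit + 1 ))
    done
    printf '%s\n' "${out:-[none]}"
)

# Show the ambient set of the current process, if the kernel exposes it.
if [ -r /proc/self/status ]; then
    amb=$(awk '/^CapAmb:/ {print $2}' /proc/self/status)
    if [ -n "$amb" ]; then
        echo "Ambient capabilities: $(decode_caps "$amb")"
    fi
fi
```

For example, decode_caps 00000000a80425fb prints the same fourteen names that setpriv --dump lists in the first dump above, and decode_caps 0000000000000000 prints [none], matching the second. Where libcap's capsh is installed, capsh --decode=MASK gives similar output.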
bwrap has to be a bit paranoid about being invoked with unexpected capabilities, because the most likely reason to have them is that it's setuid root or has file capabilities (as in setcap(8)); and if it has been installed like that, then the sysadmin is trusting bwrap to act as a security boundary between unprivileged users and excess (privileged) capabilities.
For setuid, bwrap can check the real and effective uid to tell whether it was setuid root, but for file capabilities I'm not sure whether it can.
After digging into it once more, it looks like the issue comes from ambient capabilities.
Podman with the --user argument removes them, allowing bwrap to run. If I remove the ambient capabilities by other means, e.g. with setpriv --ambient-caps '-all' inside the container, bwrap runs normally, too.
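The setpriv workaround can be wrapped in a small helper, sketched below under a few assumptions: the function names ambient_is_empty and run_without_ambient are invented for this example, setpriv from util-linux is assumed to be installed, and the kernel is assumed to expose CapAmb in /proc/self/status. It only clears the ambient set when there is something to clear.

```shell
#!/bin/sh
# ambient_is_empty HEXMASK: succeed if no bit is set in the mask.
ambient_is_empty() {
    [ $(( 0x$1 )) -eq 0 ]
}

# run_without_ambient CMD [ARGS...]: run CMD directly if the current ambient
# set is empty (or cannot be read), otherwise run it through setpriv with all
# ambient capabilities cleared.
run_without_ambient() {
    amb=$(awk '/^CapAmb:/ {print $2}' /proc/self/status 2>/dev/null)
    if [ -z "$amb" ] || ambient_is_empty "$amb"; then
        "$@"
    else
        setpriv --ambient-caps -all "$@"
    fi
}
```

With this in place, the reproducer would be started as run_without_ambient /usr/libexec/flatpak-bwrap --unshare-all --ro-bind / / --proc /proc true. This only hides the capabilities from bwrap; it does not change what podman hands to the container's first process.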
+1 on this issue. Could you make bwrap just issue a warning, but continue running?
I'm using a trick with ambient caps due to the fact that NFS4 doesn't support caps set on executable files. Now I'm between a rock and a hard place, because I can either have Steam games using Steam runtime working, or I can have the best performance in SteamVR.
Could you make bwrap just issue a warning, but continue running?
Not really, because, as I said earlier:
bwrap has to be a bit paranoid about being invoked with unexpected capabilities, because the most likely reason to have them is that it's setuid root or has file capabilities (as in setcap(8)); and if it has been installed like that, then the sysadmin is trusting bwrap to act as a security boundary between unprivileged users and excess (privileged) capabilities.
If bwrap is being trusted to act as a security boundary, then it really ought to "fail closed".
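To make the "fail closed" reasoning concrete, here is a rough shell analogue of the decision described above. It is not bwrap's actual code (bwrap is written in C); the function name classify and its three outcomes are hypothetical, and treating real uid 0 as "capabilities are expected" is an assumption of this sketch.

```shell
#!/bin/sh
# classify UID EUID CAPPRM_HEX: print how a bwrap-like launcher might treat
# the state it starts in.
#   setuid  - real and effective uid differ, so the binary was installed setuid
#   refuse  - not setuid, not root, yet capabilities are present: their origin
#             is unknown, so stop rather than guess (fail closed)
#   proceed - nothing unexpected
classify() {
    uid=$1
    euid=$2
    prm=$(( 0x$3 ))
    if [ "$uid" -ne "$euid" ]; then
        echo "setuid"
    elif [ "$uid" -ne 0 ] && [ "$prm" -ne 0 ]; then
        echo "refuse"
    else
        echo "proceed"
    fi
}
```

With the first setpriv dump above (uid and euid 1000, ambient capabilities that become permitted on exec), classify 1000 1000 00000000a80425fb prints refuse. With the second dump, classify 1000 1000 0000000000000000 prints proceed.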
I ran into this issue. To illustrate it, let's use a simpler reproduction than the podman example.
Usually you can run bwrap inside of bwrap:
bwrap --dev-bind / / -- bwrap --dev-bind / / -- echo hello
hello
This command creates a bwrap environment in which we run bwrap again to run echo hello.
Let's change the command to add a capability to the first environment. Now the second environment fails:
bwrap --cap-add CAP_SYS_NICE --dev-bind / / -- bwrap --dev-bind / / -- echo hello
bwrap: Unexpected capabilities but not setuid, old file caps config?
What @smcv is saying here
the sysadmin is trusting bwrap to act as a security boundary between unprivileged users and excess (privileged) capabilities
is that (?) usually when you run bwrap without --cap-add you expect the resulting environment to not have any capabilities. However, in this example the expectation would be violated so bwrap refuses to run to be safe. This is reasonable.
Note that we can work around this by dropping the capability again. Using capsh:
bwrap --cap-add CAP_SYS_NICE --dev-bind / / -- capsh --caps="" -- -c "bwrap --dev-bind / / -- echo hello"
hello
Based on this, I suggest that bwrap automatically drop all capabilities that have not been specified with --cap-add, instead of erroring out as it does now. This has no security downside, as we still respect the user's expectation that bwrap environments do not silently gain capabilities.
If you do want to forward the capability to the next bwrap environment then this should work too.
With my proposed changes my examples would look like this:
bwrap --cap-add CAP_SYS_NICE --dev-bind / / -- bwrap --dev-bind / / -- echo hello
hello # second environment has no capabilities
bwrap --cap-add CAP_SYS_NICE --dev-bind / / -- bwrap --cap-add CAP_SYS_NICE --dev-bind / / -- echo hello
hello # second environment has CAP_SYS_NICE capability
I am willing to try implementing these changes if a maintainer would accept them.
Systemd 254 adds cap_wake_alarm by default in pam_systemd (https://github.com/systemd/systemd/blob/0e2f18eedd6b9be32b1c1122dcd2c30319074c7f/NEWS#L703), and now it's very easy to run into this error. For example, phosh.service uses PAMName=login and thus gets this capability, and then fails to start any .desktop files that depend on bwrap, such as Epiphany (which uses bwrap) or, of course, any Flatpak app.
Edit: Filed a patch for Phosh: https://gitlab.gnome.org/World/Phosh/phosh/-/merge_requests/1351
For setuid, bwrap can check the real and effective uid to tell whether it was setuid root, but for file capabilities I'm not sure whether it can.
Can't you check it using the getppid() syscall and /proc/<pid>/status's CapEff entry?
Can't you check it using the getppid() syscall and /proc/<pid>/status's CapEff entry?
This would be racy, both for the pid (pid reuse) and for CapEff, which could have changed in the meantime.