sysbox
No such file or directory - unprivileged_userns_clone
Am seeing the following error when attempting to install Sysbox on an RKE2 worker node.
sysctl: cannot stat /proc/sys/kernel/unprivileged_userns_clone: No such file or directory
Using the latest version of all system components, including the latest "longterm" version of the Linux kernel as per https://kernel.org.
os=Ubuntu 22.04.4 LTS, kernel=6.6.17-060617-generic, sysbox=0.6.3 CE, rke2=v1.28.6+rke2r1, rancher=v2.8.2
Here is a summary of results when using various kernel versions.
- 6.8-rc4 (mainline) - fails
- 6.7.5 (stable) - fails
- 6.6.17 (longterm) - fails
- 6.5.18 (from the mainline) - succeeds
- 6.5.18 (Ubuntu LTS HWE) - succeeds
Here are the Sysbox logs.
Detected Kubernetes version v1.28
2024-02-18T14:06:20.436036364+10:00 Adding K8s taint "sysbox-runtime=not-running:NoSchedule" to node ...
2024-02-18T14:06:20.502085520+10:00 node/w6 modified
2024-02-18T14:06:20.506327963+10:00 Adding K8s label "crio-runtime=installing" to node ...
node/w6 not labeled
2024-02-18T14:06:20.571532866+10:00 Deploying CRI-O installer agent on the host (v1.28) ...
2024-02-18T14:06:21.002092200+10:00 Running CRI-O installer agent on the host (may take several seconds) ...
2024-02-18T14:06:23.395783002+10:00 Removing CRI-O installer agent from the host ...
2024-02-18T14:06:23.908565022+10:00 Configuring CRI-O ...
2024-02-18T14:06:25.114476735+10:00 Adding K8s label "sysbox-runtime=installing" to node ...
2024-02-18T14:06:25.175698518+10:00 node/w6 not labeled
2024-02-18T14:06:25.194719353+10:00 Installing Sysbox dependencies on host ...
2024-02-18T14:06:25.205312310+10:00 Copying shiftfs sources to host ...
2024-02-18T14:06:25.206274620+10:00 Kernel version 6.6, which is above the max required for shiftfs (6.2); skipping shiftfs installation.
2024-02-18T14:06:25.206287839+10:00 Deploying Sysbox installer helper on the host ...
2024-02-18T14:06:25.499708961+10:00 Running Sysbox installer helper on the host (may take several seconds) ...
2024-02-18T14:06:28.052809028+10:00 Stopping the Sysbox installer helper on the host ...
2024-02-18T14:06:28.404940170+10:00 Removing Sysbox installer helper from the host ...
2024-02-18T14:06:28.704056754+10:00 Installing Sysbox on host ...
Configuring host sysctls ...
2024-02-18T14:06:29.921906091+10:00 sysctl: cannot stat /proc/sys/kernel/unprivileged_userns_clone: No such file or directory
2024-02-18T14:06:29.922058926+10:00 fs.inotify.max_queued_events = 1048576
2024-02-18T14:06:29.922064366+10:00 fs.inotify.max_user_watches = 1048576
fs.inotify.max_user_instances = 1048576
2024-02-18T14:06:29.922069990+10:00 kernel.keys.maxkeys = 20000
2024-02-18T14:06:29.922072582+10:00 kernel.keys.maxbytes = 1400000
2024-02-18T14:06:29.922075222+10:00 kernel.pid_max = 4194304
Also noticed the following vulnerability can be mitigated by disabling unprivileged user namespaces.
- https://coder.com/blog/statement-on-the-recent-cve-2022-0185-vulnerability
- https://ubuntu.com/security/CVE-2022-0185
sysctl -w kernel.unprivileged_userns_clone=0
Am assuming the Sysbox error encountered above relates to attempting to disable unprivileged user namespaces with sysctl.
Found the following reference in the Sysbox user guide.
- https://github.com/nestybox/sysbox/blob/release_v0.6.3/docs/user-guide/troubleshoot.md#unprivileged-user-namespace-creation-error
Along with the following fix.
sudo sh -c "echo 1 > /proc/sys/kernel/unprivileged_userns_clone"
This appears to allow the vulnerability identified above (as opposed to preventing it).
Attempting to execute the fix produces the following error.
sh: 1: cannot create /proc/sys/kernel/unprivileged_userns_clone: Directory nonexistent
Running the following command shows that the /proc/sys/kernel directory exists.
ls -l /proc/sys/kernel
Hi @matthewparkinsondes,
Thanks for reporting the issue.
Context
Normally, Sysbox does not require the /proc/sys/kernel/unprivileged_userns_clone sysctl to be set, except in some scenarios (see below). That's because this sysctl controls whether unprivileged users can create user-namespaces. Since Sysbox runs as a privileged process on the host, the sysctl does not apply to Sysbox per se.
Having said that, inside a Sysbox container, all processes are unprivileged at host level since they run inside a user-namespace. If a process inside the Sysbox container wishes to create a user-namespace (e.g., running Docker with userns-remap enabled), that will fail unless /proc/sys/kernel/unprivileged_userns_clone is set at host level.
This is why the Sysbox K8s installer tries to set /proc/sys/kernel/unprivileged_userns_clone to 1. And sysbox-runc also checks for this when creating a container.
Though we could relax this so that Sysbox does not require it, in general having /proc/sys/kernel/unprivileged_userns_clone set to 1 is not a security problem, except when there are Linux kernel user-namespace vulnerabilities / CVEs (such as CVE-2022-0185) that allow an unprivileged process to escalate to root by creating a user-namespace.
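For illustration, here is a minimal sketch of a guarded write that skips the sysctl when the file is absent instead of failing. This is not the Sysbox installer's code; the function name `set_userns_clone` and its optional second argument (an alternate proc root, added only so the function can be tried against a scratch directory) are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of Sysbox: write VALUE to
# unprivileged_userns_clone only if this kernel exposes it.
# $1 = value to write (0 or 1)
# $2 = proc root (assumption for testing; defaults to /proc)
set_userns_clone() {
    value="$1"
    proc_root="${2:-/proc}"
    target="$proc_root/sys/kernel/unprivileged_userns_clone"

    if [ -e "$target" ]; then
        # On a real host this write requires root.
        echo "$value" > "$target"
    else
        # The sysctl is a distro kernel patch, so it may be missing.
        echo "skipping: $target not present on this kernel" >&2
    fi
}
```

On a real host this would be run as root, e.g. `set_userns_clone 1`.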
The issue you reported
Here is a summary of results when using various kernel versions.
- 6.8-rc4 (mainline) - fails
- 6.7.5 (stable) - fails
- 6.6.17 (longterm) - fails
- 6.5.18 (from the mainline) - succeeds
- 6.5.18 (Ubuntu LTS HWE) - succeeds
This is interesting; normally Ubuntu/Debian distros have the /proc/sys/kernel/unprivileged_userns_clone sysctl, so something is wrong in 6.6.17 and above.
For the failing cases, what do you see under /proc/sys/kernel?
Hi @ctalledo,
Thanks, here are the contents of /proc/sys/kernel for the 6.5.21 passing case.
6.5.21
Compared with 6.5.21, the 6.6.17 failing case is missing the following entries under /proc/sys/kernel.
- apparmor_restrict_unprivileged_io_uring
- apparmor_restrict_unprivileged_unconfined
- apparmor_restrict_unprivileged_userns
- apparmor_restrict_unprivileged_userns_complain
- apparmor_restrict_unprivileged_userns_force
- unprivileged_userns_clone
6.6.17
Compared with 6.6.17, the /proc/sys/kernel directory in the 6.7.6 failing case adds the following.
- apparmor_restrict_unprivileged_unconfined
And it removes the following.
- sched_child_runs_first
6.7.6
The /proc/sys/kernel directory in the 6.8-rc6 failing case is identical to 6.7.6.
6.8-rc6
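A comparison like the one above can be reproduced by saving a sorted listing of /proc/sys/kernel under each kernel and diffing the saved files. The sketch below is one way to do that; the helper names `snapshot_sysctls` and `diff_listings` are hypothetical, and the directory argument exists only so the helper can be pointed at a different path.

```shell
#!/bin/sh
# Hypothetical helpers for comparing sysctl directory listings
# captured under two different kernels.

# Print a sorted, one-per-line listing of a directory
# (defaults to /proc/sys/kernel).
snapshot_sysctls() {
    ls -1 "${1:-/proc/sys/kernel}" | LC_ALL=C sort
}

# $1 = listing saved under the older kernel
# $2 = listing saved under the newer kernel
diff_listings() {
    # comm needs sorted input; -23 keeps lines unique to the first
    # file, -13 keeps lines unique to the second file.
    LC_ALL=C comm -23 "$1" "$2" | sed 's/^/removed: /'
    LC_ALL=C comm -13 "$1" "$2" | sed 's/^/added: /'
}
```

Typical use would be `snapshot_sysctls > "sysctls-$(uname -r).txt"` after booting each kernel, followed by `diff_listings` on the two saved files.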
Thanks @matthewparkinsondes for the detailed response. Let me double-check on my side; if confirmed, it looks like we will need to change the Sysbox check for /proc/sys/kernel/unprivileged_userns_clone in the newer kernels.
Also, in the newer kernels, do you see /proc/sys/user/max_user_namespaces?
Thanks!
thanks @ctalledo.
Also, in the newer kernels, do you see /proc/sys/user/max_user_namespaces?
yes ... this is present in all of the newer kernels ... 6.6.21, 6.7.9 and 6.8-rc7
yes ... this is present in all of the newer kernels ... 6.6.21, 6.7.9 and 6.8-rc7
Ah, that likely means that starting with those newer kernels, Ubuntu distros are behaving as Fedora-based distros do, which use /proc/sys/user/max_user_namespaces as a way to enable unprivileged user-ns. If true, this is good news as it creates more consistency across the distros, though we need to update Sysbox to detect the kernel version and then check for the right file.
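As a rough sketch of what such a check could look like, the function below reports whether unprivileged user namespaces appear to be enabled. It is not Sysbox's implementation: it probes for the files directly rather than detecting the kernel version as suggested above, and the name `userns_status` and its optional proc-root argument are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of Sysbox: print "enabled",
# "disabled", or "unknown" for unprivileged user namespaces.
# $1 = proc root (assumption for testing; defaults to /proc)
userns_status() {
    proc_root="${1:-/proc}"
    clone_file="$proc_root/sys/kernel/unprivileged_userns_clone"
    max_file="$proc_root/sys/user/max_user_namespaces"

    # Neither knob is visible: nothing can be concluded.
    if [ ! -e "$clone_file" ] && [ ! -e "$max_file" ]; then
        echo unknown
        return 0
    fi

    # Distro-patched knob: 0 blocks unprivileged creation.
    if [ -e "$clone_file" ] && [ "$(cat "$clone_file")" = "0" ]; then
        echo disabled
        return 0
    fi

    # Upstream knob: a limit of 0 prevents creating user namespaces.
    if [ -e "$max_file" ] && [ "$(cat "$max_file")" = "0" ]; then
        echo disabled
        return 0
    fi

    echo enabled
}
```

Calling `userns_status` with no arguments inspects the live host. Probing for the files avoids hard-coding a kernel version cutoff, which the thread has not yet pinned down.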