[seccomp] update defaults to current docker rules (Started as: block unshare by default)

Open martinetd opened this issue 1 year ago • 17 comments

Hi,

I've noticed that on my systems (fedora, debian, alpine) it's possible to get network admin privileges in a user namespace within a container:

$ podman run --rm -ti docker.io/alpine
/ # apk add iptables
fetch https://dl-cdn.alpinelinux.org/alpine/v3.19/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.19/community/x86_64/APKINDEX.tar.gz
(1/4) Installing libmnl (1.0.5-r2)
(2/4) Installing libnftnl (1.2.6-r0)
(3/4) Installing libxtables (1.8.10-r3)
(4/4) Installing iptables (1.8.10-r3)
Executing busybox-1.36.1-r15.trigger
OK: 10 MiB in 19 packages
/ # unshare -Urn
526a5598f862:/# iptables -S
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT

I'd have expected this to be blocked. Looking at the git history, there was an attempt to allow unshare only for containers with CAP_SYS_ADMIN, but for some reason the rule was also duplicated and allowed in the general case, and that duplication got cleaned up "the wrong way" in bf297c191ea46d0c0b252a1549dc4cfe6733dfe0.

I've checked the latest docker rules (by running docker), and they block unshare properly by default, so blocking it looks like the Right Thing to Do (it is still not blocked for sys admin containers, which is what we originally had).
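To make the intended behaviour concrete, here is a minimal Go sketch of a capability-gated allow rule. The `Filter` and `Rule` types and the `allowed` helper are hypothetical, simplified stand-ins written for this illustration; they are not the actual types or evaluation logic in containers/common or moby, and the rule list is not the real default profile.

```go
package main

import "fmt"

// Filter is a hypothetical stand-in: a rule applies only if the
// container holds every capability listed in Caps.
type Filter struct {
	Caps []string
}

// Rule is a hypothetical, simplified seccomp allow rule.
type Rule struct {
	Names    []string // syscall names covered by this rule
	Allow    bool     // true means the syscalls are permitted
	Includes Filter   // capability condition for the rule
}

// defaultRules returns an illustrative rule set (an assumption, not the
// real profile): a few unconditional syscalls, plus unshare gated on
// CAP_SYS_ADMIN.
func defaultRules() []Rule {
	return []Rule{
		{Names: []string{"read", "write", "exit_group"}, Allow: true},
		{
			Names:    []string{"unshare"},
			Allow:    true,
			Includes: Filter{Caps: []string{"CAP_SYS_ADMIN"}},
		},
	}
}

// hasAll reports whether caps contains every capability in want.
func hasAll(caps map[string]bool, want []string) bool {
	for _, c := range want {
		if !caps[c] {
			return false
		}
	}
	return true
}

// allowed reports whether the named syscall is permitted for a container
// holding caps. Anything not matched by an applicable allow rule is
// treated as denied, mimicking a deny-by-default profile.
func allowed(rules []Rule, caps map[string]bool, name string) bool {
	for _, r := range rules {
		if !r.Allow || !hasAll(caps, r.Includes.Caps) {
			continue
		}
		for _, n := range r.Names {
			if n == name {
				return true
			}
		}
	}
	return false
}

func main() {
	rules := defaultRules()
	plain := map[string]bool{}
	admin := map[string]bool{"CAP_SYS_ADMIN": true}
	fmt.Println("unshare, default container:", allowed(rules, plain, "unshare"))
	fmt.Println("unshare, CAP_SYS_ADMIN container:", allowed(rules, admin, "unshare"))
}
```

In this sketch, a duplicate unconditional rule for unshare would make the capability-gated one irrelevant, which is the kind of situation described above.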

At this point I checked their seccomp rules and there are quite a few other changes. I believe https://github.com/containers/common/blob/main/pkg/seccomp/default_linux.go was originally based on https://github.com/mody/moby/blob/main/profiles/seccomp/default_linux.go , but the docker one is considerably stricter and allows more syscalls only when certain caps are given.

I've started updating the file locally, but before I spend more time on this:

  • What's the policy on using docker things? In this case I'm looking at what they're doing and so far it all makes sense. The syntax is a bit different, but functionally I'll probably end up very close to docker's default seccomp profile. Both repos are under the Apache license so I don't think that'll cause much of a problem, but I'd like to confirm that first.
  • Do we want bug-for-bug compatibility (i.e. just copy all their rules), or only take what looks useful? I've started cherry-picking, but frankly it might be faster to just sync.

Thanks!

martinetd, May 09 '24 01:05