Implement libseccomp for syscall filtering

Open oxr463 opened this issue 4 years ago • 2 comments

I propose we implement libseccomp and multi-threading simultaneously (see: https://github.com/seccomp/libseccomp/issues/102).

Originally posted by @oxr463 in https://github.com/proot-me/proot/issues/111#issuecomment-554496966

See: https://github.com/proot-me/proot/issues/106, https://github.com/proot-me/proot/pull/130

oxr463 avatar Nov 25 '19 21:11 oxr463

Excerpts from an email with libseccomp maintainers.

Per Paul,

Hi Lucas,

It looks like proot is trying to ptrace another process, possibly outside its sandbox, and is running into problems that you are attributing to seccomp? Further, it doesn't look like proot itself is using libseccomp but rather is creating its own seccomp filters. Without taking a deep dive into proot and its seccomp filters, it is hard to say what the exact problem may be. Tracing can be difficult to get right, especially when seccomp filters are involved. I'm not sure if this is your problem, but the issue that comes to mind first is the "-1" syscall; see the libseccomp documentation around SCMP_FLTATR_API_TSKIP for more information on that (the manpage can be seen here -> https://github.com/seccomp/libseccomp/blob/master/doc/man/man3/seccomp_attr_set.3).

FWIW, it also appears that you are not handling a lot of ABI-specific details correctly; for example, the x32/x86-64 ABI detection needs special handling, as do the socket/ipc syscalls on platforms which support both the multiplexed and direct-wired versions.

My guess is it wouldn't be too difficult, maybe Tom has some experience here? As a bonus, if you adapted proot to use libseccomp, the ABI problems mentioned above would be taken care of by the library itself, you wouldn't need to worry about them in proot.

Per Tom,

I briefly looked through your PRoot code, and while more complex than the transition I helped work on, at initial glance it doesn't look too daunting to switch to libseccomp.

Paul brings up several good points of the advantages of using libseccomp as well. We (libseccomp) will take care of cBPF creation, ABI issues, and any syscall changes when a new kernel is released. Maintaining a custom filter throughout all of that can be challenging.

Per Paul,

you will just need to ensure that you have added the required architecture to the filter, look at the seccomp_arch_add(...) function/manpage for more information. As long as you add the architecture before you add the rules, adding a rule to the filter will add it to the architecture.

If you want to create different rule sets for different architectures you can create multiple filters, one for each architecture, and merge them later using the seccomp_merge(...) function. There are some limitations, e.g. same filter attributes, same endianness, etc., but in general it works just fine.

In general, whitelists tend to be more restrictive (and potentially "safer" as a result), but they are harder to maintain. I believe most general-purpose container and sandbox tools currently use a blacklist approach.

oxr463 avatar Jan 17 '20 13:01 oxr463

See: https://github.com/proot-me/proot/search?q=seccomp.h&unscoped_q=seccomp.h

oxr463 avatar Jun 15 '20 19:06 oxr463