Proposal: use pre-generated BPF filter
I remember this was already discussed somewhere some years ago, but I couldn't find it (perhaps on a different project?), so I am opening it again here to continue a discussion.
The current way of setting Seccomp rules in the OCI config file is quite inflexible and modeled around the libseccomp APIs.
I propose adding another way to pass the seccomp profile using the final BPF program to load, allowing for more adaptable and dynamic security configurations that can be generated outside of the OCI runtime itself.
crun already supports it through a custom annotation run.oci.seccomp_bpf_data specifying the BPF data to load.
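For illustration, here is a minimal sketch of how a tool outside the runtime could attach a pre-built filter to a config through that annotation. The helper name `with_raw_seccomp` is hypothetical, and the base64 encoding is an assumption based on my reading of the crun documentation, so please check it against the crun man page before relying on it.

```python
import base64
import struct


def with_raw_seccomp(config: dict, bpf_program: bytes) -> dict:
    """Return a copy of an OCI config dict carrying the crun annotation.

    Assumption: the annotation value is the base64 encoding of the raw
    array of struct sock_filter entries.
    """
    updated = dict(config)
    annotations = dict(updated.get("annotations", {}))
    annotations["run.oci.seccomp_bpf_data"] = base64.b64encode(bpf_program).decode("ascii")
    updated["annotations"] = annotations
    return updated


# Smallest possible program: a single "return SECCOMP_RET_ALLOW" instruction.
# struct sock_filter is { u16 code; u8 jt; u8 jf; u32 k; } in native byte order.
BPF_RET_K = 0x06
SECCOMP_RET_ALLOW = 0x7FFF0000
allow_all = struct.pack("=HBBI", BPF_RET_K, 0, 0, SECCOMP_RET_ALLOW)

config = with_raw_seccomp({"ociVersion": "1.1.0"}, allow_all)
```

The original config dict is left untouched, so the same base config can be reused with different filters.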
👍
+1, but I have a few questions:
- How do we handle the existing seccomp rules field with run.oci.seccomp_bpf_data?
- What format is in this field? e.g., seccomp rules (DSL)
- Who is responsible for the seccomp BPF of the dependent parts of the architecture?
I agree this would be useful.
(I think run.oci.seccomp_bpf_data is not a good name for a crun-specific annotation, but let's skip that for now.)
I think just allowing a binary blob (probably base64-encoded I guess) to pass to seccomp(2) would be the simplest solution, and allowing us to slowly move away from libseccomp-isms would be nice (though there are a lot of other downsides to having to hand-roll everything). If a user wants to have their own custom BPF filter, they can handle checking the architecture in the BPF filter they write.
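To make the "binary blob plus architecture check" idea concrete, here is a rough sketch of a hand-rolled classic BPF filter serialized and base64-encoded. The constants are taken from the Linux UAPI headers; the choice of x86_64, the blocked syscall (mount, number 165 on x86_64) and the function names are only illustrative assumptions. It does not handle the x32 ABI.

```python
import base64
import struct

# Classic BPF opcodes (linux/bpf_common.h).
BPF_LD_W_ABS = 0x20  # BPF_LD | BPF_W | BPF_ABS
BPF_JEQ_K = 0x15     # BPF_JMP | BPF_JEQ | BPF_K
BPF_RET_K = 0x06     # BPF_RET | BPF_K

# Seccomp return values and audit arch (linux/seccomp.h, linux/audit.h).
SECCOMP_RET_KILL_PROCESS = 0x80000000
SECCOMP_RET_ERRNO = 0x00050000
SECCOMP_RET_ALLOW = 0x7FFF0000
AUDIT_ARCH_X86_64 = 0xC000003E

# Offsets into struct seccomp_data.
OFF_NR = 0
OFF_ARCH = 4

EPERM = 1
SYS_MOUNT_X86_64 = 165  # illustrative choice of syscall to block


def insn(code: int, jt: int, jf: int, k: int) -> bytes:
    """Pack one struct sock_filter { u16 code; u8 jt; u8 jf; u32 k; }."""
    return struct.pack("=HBBI", code, jt, jf, k)


def build_filter() -> bytes:
    program = [
        insn(BPF_LD_W_ABS, 0, 0, OFF_ARCH),               # load arch
        insn(BPF_JEQ_K, 1, 0, AUDIT_ARCH_X86_64),         # match: skip the kill
        insn(BPF_RET_K, 0, 0, SECCOMP_RET_KILL_PROCESS),  # unexpected arch
        insn(BPF_LD_W_ABS, 0, 0, OFF_NR),                 # load syscall number
        insn(BPF_JEQ_K, 0, 1, SYS_MOUNT_X86_64),          # not mount: skip errno
        insn(BPF_RET_K, 0, 0, SECCOMP_RET_ERRNO | EPERM),
        insn(BPF_RET_K, 0, 0, SECCOMP_RET_ALLOW),
    ]
    return b"".join(program)


blob = build_filter()
encoded = base64.b64encode(blob).decode("ascii")
```

The runtime would only need to decode `encoded` and hand the instruction array to seccomp(2); everything about architectures lives in the filter itself.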
My main concerns would be:
- How do we nicely have this coexist with the existing seccomp config. Obviously you can only specify one, but should it be a separate field in linux or something like linux.seccomp.rawFilter (the latter is more "idiomatic" but would make config validation more annoying).
- We almost certainly need to allow runtimes to modify user-provided filters to some extent (such as adding runc's -ENOSYS stub to work around compatibility issues -- though the -ENOSYS stub wouldn't work with raw filters as currently designed...). We will need to make sure the wording allows this, but also doesn't allow a runtime to silently ignore the setting and set a custom filter...
- Since an -ENOSYS stub would not be easy to implement for anything but the most basic of filters, we will probably need to have some kind of text to explain that users will need to make sure they handle compatibility in a reasonable way. I'm a little concerned that a popular tool will generate configurations that we (as runtimes) cannot monkey-patch to work around...
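As a rough idea of what "handling compatibility" inside a user-generated filter could look like, here is a sketch of an allowlist filter that returns ENOSYS for any syscall number above the largest one the generator knew about, and EPERM for known-but-denied syscalls. This is not runc's actual stub, only something in the same spirit. The names `build_filter` and `evaluate` are hypothetical, the architecture check is omitted for brevity (a real filter would need it first), and `evaluate` is a toy interpreter covering only the four opcodes used here.

```python
import errno
import struct

BPF_LD_W_ABS = 0x20  # BPF_LD | BPF_W | BPF_ABS
BPF_JEQ_K = 0x15     # BPF_JMP | BPF_JEQ | BPF_K
BPF_JGT_K = 0x25     # BPF_JMP | BPF_JGT | BPF_K (unsigned compare)
BPF_RET_K = 0x06     # BPF_RET | BPF_K

SECCOMP_RET_ERRNO = 0x00050000
SECCOMP_RET_ALLOW = 0x7FFF0000


def insn(code: int, jt: int, jf: int, k: int) -> bytes:
    return struct.pack("=HBBI", code, jt, jf, k)


def build_filter(allowed: list[int], max_known_nr: int) -> bytes:
    """Allowlist filter with an ENOSYS guard for syscalls newer than max_known_nr."""
    n = len(allowed)
    if n > 255:
        raise ValueError("jump offsets are 8-bit; this sketch supports at most 255 entries")
    program = [
        insn(BPF_LD_W_ABS, 0, 0, 0),           # load syscall number
        insn(BPF_JGT_K, 0, 1, max_known_nr),   # unknown (newer) syscall?
        insn(BPF_RET_K, 0, 0, SECCOMP_RET_ERRNO | errno.ENOSYS),
    ]
    for j, nr in enumerate(allowed):
        # On match, jump over the remaining checks and the default to ALLOW.
        program.append(insn(BPF_JEQ_K, n - j, 0, nr))
    program.append(insn(BPF_RET_K, 0, 0, SECCOMP_RET_ERRNO | errno.EPERM))
    program.append(insn(BPF_RET_K, 0, 0, SECCOMP_RET_ALLOW))
    return b"".join(program)


def evaluate(program: bytes, nr: int) -> int:
    """Toy interpreter for the opcodes above; returns the seccomp action."""
    pc, acc = 0, 0
    while True:
        code, jt, jf, k = struct.unpack_from("=HBBI", program, pc * 8)
        pc += 1
        if code == BPF_LD_W_ABS:
            acc = nr  # only offset 0 (the syscall number) is modelled
        elif code == BPF_JGT_K:
            pc += jt if acc > k else jf
        elif code == BPF_JEQ_K:
            pc += jt if acc == k else jf
        elif code == BPF_RET_K:
            return k
        else:
            raise ValueError(f"unsupported opcode {code:#x}")


program = build_filter(allowed=[0, 1, 60], max_known_nr=450)
```

With this layout, `evaluate(program, 1)` yields the allow action, a known but unlisted number such as 2 yields EPERM, and anything above 450 yields ENOSYS. The catch is that `max_known_nr` is baked in by the generator, which is exactly why a runtime cannot easily patch it in after the fact.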