rfcs
rfcs copied to clipboard
Sandbox build.rs and plugins
Currently, it's impossible to even typecheck a crate without running arbitrary code. Unless I'm mistaken, most plugins/build.rs
files just do codegen so they should work just fine inside a sandbox so it would be nice to provide some sort of safe/sandboxed way to generate code. See:
https://github.com/flycheck/flycheck/issues/894
:+1:, I wasn't aware this wasn't already done.
What kind of sandbox? What would it disallow? How would buildscripts that need extra permissions ask for them?
@sfackler I imagine two ways of doing it:
- Using a restricting virtual environment.
- Restricting the code to a subset of Rust.
The former is probably the easiest. However, the latter seems like the "correct" solution.
Personally, I'd go with 1. On Linux, I'd use seccomp and pass in a file descriptor to which generated code should be written.
Unfortunately, this won't work for projects that need to build c code.
I actually think (2) seems better, and I'm not sure I'd even go with Rust. It's good to have general-purpose build.rs for exceptional situations, but I bet most current ones fall into several cookie-cutter categories:
- Build a simple C library and link it in (e.g. a -sys crate)
- Syntex stuff
- Output some custom linker flags (same ones every time)
Would it be good to have a simple configuration format describing these? It could also be used to make decisions like @Stebalien says, i.e. Syntex munging needs to be done before typechecking but linker stuff does not. It also plays into the question of reproducible builds.
(trying to resurrect this thread) - without being able to sandbox build.rs, any usage of Rust made by me and my industry is hamstrung due to security concerns. We're effectively in the situation where we cannot use crates from crates.io without hand inspecting every crate/version to ensure that nothing malicious occurs.
While this is not much different from the existing C/C++ community (pull in a repo, use some arbitrary build system that could equally be malicious), the problem in Rust is exacerbated primarily because of how easy/safe it feels to pull in crates from crates.io.
As a strawperson solution:
- Could we define a 'safe' subset of what is allowed in a build.rs - EG. no file system / network / extern C calls.
- Could we allow crates to advertise via their TOML that they are a exposing a 'safe' build.rs / or even a 'safe' crate.
- A safe crate would only allow calls to 'safe' std functions - we could start by allowing 0 functions, and introduce a macro #[allow_in_safe] and gradually roll that out across the std as required. If a build.rs is 'safe' it can only call allow_in_safe functions, if a crate is 'safe' all the rust code can only call allow_in_safe functions.
- Allow a build.rs to write to a single output while in 'safe' mode. This would still allow nearly all codegen requests with Rust while still sandboxing what can be done.
- Only allow them to read files within the crates folder in a build.rs while in 'safe' mode.
- Lastly allow a TOML to specify that it can only use 'safe' build.rs or 'safe' crate, and make it an error if the crate is not 'safe'.
- Also don't stop build.rs as it is now, but make this a developer opt-in for those crates that are willing to make the change / those crate-users that are most concerned by security.
I may have missed something, but I (and others) have been trying to work out a nice solution to a very real problem with crates.io, and we obviously don't want to lose the awesome power of build.rs (I use it in my own Vulkan bindings!) but want to ensure people see our own crates as being secure.
I'm worried that this issue will be sat on until some utterly horrendous backdoor is discovered in a crate that malwares up a system or copies the contents elsewhere, and the backlash against our community will be harsher than would be warranted.
I really want a solution here 😄
Don't procedural macros also allow arbitrary code execution? So it would not be sufficient to sandbox build.rs
. I also think that most crates with a build.rs
will do file system access (particularly now that procedural macros exist) and so providing a "safe" language subset will not be particularly useful, whilst also leaving a very large attack surface.
These seem like the most promising options:
- Using existing sand-boxing technologies, maybe there's even a standardised interface for this, so that the sand-boxing technology can be swapped out easily?
- Using something like MIRI or a wasm interpreter to interpret code run at compile time, and limit access that way.
- Provide a way to disable compile-time execution entirely for most crates, and then have a white-list of "safe" crates. Perhaps there could be a community-maintained list that companies could use, so that the most common crates like serde, etc. can be relied on without additional work.
Most crates that use build.rs seem to read some files within the crate, and generate a single generated.rs - all contained by reads from the crate and writes only to the output build folder. This seems like it works in a sandbox model.
Having some way to say 'a crate is safe' is great! Have it be a badge on crates.io, and allow you to when specifying the dependencies in cargo.toml say 'use this crate at this version but only if it is safe'.
Honestly using wasm or something like that seems entirely doable - I don't really care whether my solution is the right one, only that we can start having some mechanism to provide guarantees here!
Perhaps an easier starting point would be to allow a cargo build --docker
or something, which would do the entire build in a container?
@mark-i-m there are already docker images for this - I use https://hub.docker.com/r/ekidd/rust-musl-builder/ at work, and it's pretty easy already, even without any cargo-specific functionality. I think if you're going to go that route it's going to be easier to just continue using docker directly.
@Diggsey That's fair, but my point is that if this is a barrier for more people, then perhaps this is an easy-to-implement-and-maintain sandboxing solution. In other words, the cargo integration makes it convenient to do the secure thing by default.
Maybe we could sandbox macros and build.rs using WebAssembly -it's intended to be sandboxed anyway, and forcing the build tooling to use it would bring development synergy for the emerging ecosystem of WASM-compilant libraries.
This would mean integrating a WebAssembly interpreter in cargo. Possible, but will require quite some work.
I'm unclear on what the precise blocking issues are supposed to be here. In particular, why is "do all your Rust builds in docker" not a solution? Don't you already have to do all your C/C++ builds inside something like that if your threat model includes third-party build scripts?
I do support finding common build.rs patterns and breaking them out into more declarative cargo features, but to me that seems more like an issue of ergonomics, modularity and static analyzability. It definitely can also lead to good heuristics for helping the less paranoid among us stay secure by default (though I have trouble imagining what kinds of attacks we could actually prevent this way). But for an industry that's nervous about using Rust at all because of this, I'm not seeing how making build.rs scripts less common or more restricted (I assume we can't kill them off completely) leads to a meaningful security benefit that they can't already get much more easily and thoroughly with an unrelated-to-Rust sandbox. That gives you total control over what you think should be "safe" for a build.
I'm unclear on what the precise blocking issues are supposed to be here. In particular, why is "do all your Rust builds in docker" not a solution? Don't you already have to do all your C/C++ builds inside something like that if your threat model includes third-party build scripts?
Right - we do not use 3rd party C/C++ libraries because of this reason. What will happen in practice is that any use of Rust that we do will have to have 3rd party extern crates banned because of security concerns. Our legal/security teams are not going to vet each version of each crate.
My hope is that if we at least had a 'safe' crate-badge on crates.io + a toml flag that banned the use of a build.rs - that is a much more palatable crate with which I could convince security + legal people to allow.
And as a further (future) step - some kind of sandboxing of the build.rs such that it cannot just read the entire filesystem and post its contents on the internet would be much welcomed.
There is new work on this topic: https://internals.rust-lang.org/t/pre-rfc-procmacros-implemented-in-wasm/10860
A platform-agnostic approach could be to transparently provide a fake/wrapper implementation of std::env
, std::process
, std::fs
, std::io
, std::net
, etc that is used when compiling/running code in macros and build.rs, instead of letting them do any real/unfiltered I/O
I'd prefer it if macros and build.rs could not:
- read arbitrary environment variables (e.g. secrets in CI) but only an explicit allow-list of variables (we might determine that there's a safe default set including CXXFLAGS, RUSTFLAGS, TMP, etc)
- read any file path outside the project directory (where Cargo.toml is) or TMP or /usr
- make network requests of any kind
If macros and build.rs need code from elsewhere, cargo
(something we trust) should be configured to download that instead
Exceptions to the "sandbox" should be configured in Cargo.toml, but should be ignored if defined within dependencies (just like Cargo.lock)
EDIT: allowing access to /usr is important :)
In practice it's pretty important that the build.rs of Rust -sys
crates are able to look in directories like /usr/lib
and invoke tools like pkg-config
and such. This is one of the reasons why it seems difficult to do something like what you suggest, it would break a lot of build.rs's in the wild, and ones that don't really have an alternative (and locating libraries like this is one of the main points of build scripts).
It is reasonable that those crates do that, but having them explicitly state in an allow-list in their Cargo.toml what they're doing seems like a good way to make clear what's going on.
Please no, I already have enough bug reports from new users trying to get rusqlite to work, adding another step like "okay also put libsqlite3-sys
in the bikeshed.allowlist
list would just complicate this further.
If it's opt-in then sure, but if it's this way by default it will be a huge additional headache for new users trying to use crates which depend on -sys
crates (which is very common!)
I can't speak for jokeyrhyme, but personally I was imagining that a crate would declare it's required allow list in its own Cargo.toml, the same way it declares its own license and if it needs panic=unwind or to always use wrapping math and other things like that. Then the dev puts all the info in one consistent place for users to look at (or check via automated tool) the same way they can check if a crate's license works for them.
I'm saying that if this is required for new users to use any -sys
crate, it will be an enormous stumbling block.
Anyway, it's hard to see how that wouldn't be a massive breaking change, since it would completely prevent this very common use case of build.rs
by sys crates to find their libraries.
In practice it's pretty important that the build.rs of Rust
-sys
crates are able to look in directories like/usr/lib
and invoke tools likepkg-config
and such
Bah, I had /usr in my mind when I started typing my suggestion, but somehow left it out, oops
Thanks for reminding me, updated: https://github.com/rust-lang/rfcs/issues/1515#issuecomment-1133230382
I'm sure there's some amount of work we can do here that doesn't break 90% of crates and is also significantly more secure than the status quo
I think for build.rs and proc-macro, we can at least set up a sandbox that prevents it from modifying anywhere other than the crate itself, or even limit it to modifying /target only.
Then we can prevent accessing to certain dir that makes absolutely no sense, such as /boot, /sys, /home directories except the current user unless they are in $PATH and thendir are at least searchable, same for /root unless it's run as root.
And it also makes sense to
- use process namespace combined with mounting procfs with hidpid=2 to prevent the build.rs or proc macro from accessing info of other processes.
- prevent access to /dev except dor /dev/zero, /dev/null, /dev/random and /dev/urandom
- drop all capabilities and set NO_NEW_PRIVS to prevent use of setuid/setgid programs
- Hide secrets such as $CARGO_HOME/credential, $HOME/.git-credential
I think it would be best to archive this by a new environment variables which can specify a command for running build.rs and proc-macro.
I've created a post here for my ideas.
Starting with OS-specific features like process namespaces could make it difficult to achieve the same baseline isolation across all the platforms that Rust supports
Given that build.rs interacts with the world via std
, an OS-agnostic approach could be to provide build.rs with a proxy version of std
that implements the desired isolation
OS-specific features (like process namespaces) could be used instead of the proxy std
as a performance optimisation (provided the same level of isolation is achieved)
Starting with OS-specific features like process namespaces could make it difficult to achieve the same baseline isolation across all the platforms that Rust supports
Yes, that's why I think it's better to let users decide how to sandbox it by invoking the binary they specified with path to build.rs
Given that build.rs interacts with the world via
std
, an OS-agnostic approach could be to provide build.rs with a proxy version ofstd
that implements the desired isolation
It's common for them invoke external command so this won't work very well in practice.
Also it could be work around by using libc or extern "C" directlt.
OS-specific features (like process namespaces) could be used instead of the proxy
std
as a performance optimisation (provided the same level of isolation is achieved)
By letting user specify the binary to do the sandboxing, we could even use gVisor or firecracker which creates a mini VM but setup like container.