rfcs icon indicating copy to clipboard operation
rfcs copied to clipboard

Sandbox build.rs and plugins

Open Stebalien opened this issue 8 years ago • 24 comments

Currently, it's impossible to even typecheck a crate without running arbitrary code. Unless I'm mistaken, most plugins/build.rs files just do codegen so they should work just fine inside a sandbox so it would be nice to provide some sort of safe/sandboxed way to generate code. See:

https://github.com/flycheck/flycheck/issues/894

Stebalien avatar Feb 27 '16 21:02 Stebalien

:+1:, I wasn't aware this wasn't already done.

ticki avatar Feb 27 '16 21:02 ticki

What kind of sandbox? What would it disallow? How would buildscripts that need extra permissions ask for them?

sfackler avatar Feb 27 '16 21:02 sfackler

@sfackler I imagine two ways of doing it:

  1. Using a restricting virtual environment.
  2. Restricting the code to a subset of Rust.

The former is probably the easiest. However, the latter seems like the "correct" solution.

ticki avatar Feb 27 '16 22:02 ticki

Personally, I'd go with 1. On Linux, I'd use seccomp and pass in a file descriptor to which generated code should be written.

Stebalien avatar Feb 28 '16 00:02 Stebalien

Unfortunately, this won't work for projects that need to build c code.

Stebalien avatar Feb 28 '16 00:02 Stebalien

I actually think (2) seems better, and I'm not sure I'd even go with Rust. It's good to have general-purpose build.rs for exceptional situations, but I bet most current ones fall into several cookie-cutter categories:

  1. Build a simple C library and link it in (e.g. a -sys crate)
  2. Syntex stuff
  3. Output some custom linker flags (same ones every time)

Would it be good to have a simple configuration format describing these? It could also be used to make decisions like @Stebalien says, i.e. Syntex munging needs to be done before typechecking but linker stuff does not. It also plays into the question of reproducible builds.

durka avatar Feb 28 '16 16:02 durka

(trying to resurrect this thread) - without being able to sandbox build.rs, any usage of Rust made by me and my industry is hamstrung due to security concerns. We're effectively in the situation where we cannot use crates from crates.io without hand inspecting every crate/version to ensure that nothing malicious occurs.

While this is not much different from the existing C/C++ community (pull in a repo, use some arbitrary build system that could equally be malicious), the problem in Rust is exacerbated primarily because of how easy/safe it feels to pull in crates from crates.io.

As a strawperson solution:

  • Could we define a 'safe' subset of what is allowed in a build.rs - EG. no file system / network / extern C calls.
  • Could we allow crates to advertise via their TOML that they are a exposing a 'safe' build.rs / or even a 'safe' crate.
  • A safe crate would only allow calls to 'safe' std functions - we could start by allowing 0 functions, and introduce a macro #[allow_in_safe] and gradually roll that out across the std as required. If a build.rs is 'safe' it can only call allow_in_safe functions, if a crate is 'safe' all the rust code can only call allow_in_safe functions.
  • Allow a build.rs to write to a single output while in 'safe' mode. This would still allow nearly all codegen requests with Rust while still sandboxing what can be done.
  • Only allow them to read files within the crates folder in a build.rs while in 'safe' mode.
  • Lastly allow a TOML to specify that it can only use 'safe' build.rs or 'safe' crate, and make it an error if the crate is not 'safe'.
  • Also don't stop build.rs as it is now, but make this a developer opt-in for those crates that are willing to make the change / those crate-users that are most concerned by security.

I may have missed something, but I (and others) have been trying to work out a nice solution to a very real problem with crates.io, and we obviously don't want to lose the awesome power of build.rs (I use it in my own Vulkan bindings!) but want to ensure people see our own crates as being secure.

I'm worried that this issue will be sat on until some utterly horrendous backdoor is discovered in a crate that malwares up a system or copies the contents elsewhere, and the backlash against our community will be harsher than would be warranted.

I really want a solution here 😄

sheredom avatar Oct 18 '18 15:10 sheredom

Don't procedural macros also allow arbitrary code execution? So it would not be sufficient to sandbox build.rs. I also think that most crates with a build.rs will do file system access (particularly now that procedural macros exist) and so providing a "safe" language subset will not be particularly useful, whilst also leaving a very large attack surface.

These seem like the most promising options:

  • Using existing sand-boxing technologies, maybe there's even a standardised interface for this, so that the sand-boxing technology can be swapped out easily?
  • Using something like MIRI or a wasm interpreter to interpret code run at compile time, and limit access that way.
  • Provide a way to disable compile-time execution entirely for most crates, and then have a white-list of "safe" crates. Perhaps there could be a community-maintained list that companies could use, so that the most common crates like serde, etc. can be relied on without additional work.

Diggsey avatar Oct 18 '18 16:10 Diggsey

Most crates that use build.rs seem to read some files within the crate, and generate a single generated.rs - all contained by reads from the crate and writes only to the output build folder. This seems like it works in a sandbox model.

Having some way to say 'a crate is safe' is great! Have it be a badge on crates.io, and allow you to when specifying the dependencies in cargo.toml say 'use this crate at this version but only if it is safe'.

Honestly using wasm or something like that seems entirely doable - I don't really care whether my solution is the right one, only that we can start having some mechanism to provide guarantees here!

sheredom avatar Oct 18 '18 16:10 sheredom

Perhaps an easier starting point would be to allow a cargo build --docker or something, which would do the entire build in a container?

mark-i-m avatar Oct 18 '18 21:10 mark-i-m

@mark-i-m there are already docker images for this - I use https://hub.docker.com/r/ekidd/rust-musl-builder/ at work, and it's pretty easy already, even without any cargo-specific functionality. I think if you're going to go that route it's going to be easier to just continue using docker directly.

Diggsey avatar Oct 18 '18 21:10 Diggsey

@Diggsey That's fair, but my point is that if this is a barrier for more people, then perhaps this is an easy-to-implement-and-maintain sandboxing solution. In other words, the cargo integration makes it convenient to do the secure thing by default.

mark-i-m avatar Oct 18 '18 22:10 mark-i-m

Maybe we could sandbox macros and build.rs using WebAssembly -it's intended to be sandboxed anyway, and forcing the build tooling to use it would bring development synergy for the emerging ecosystem of WASM-compilant libraries.

golddranks avatar Oct 19 '18 05:10 golddranks

This would mean integrating a WebAssembly interpreter in cargo. Possible, but will require quite some work.

Ekleog avatar Oct 19 '18 10:10 Ekleog

I'm unclear on what the precise blocking issues are supposed to be here. In particular, why is "do all your Rust builds in docker" not a solution? Don't you already have to do all your C/C++ builds inside something like that if your threat model includes third-party build scripts?

I do support finding common build.rs patterns and breaking them out into more declarative cargo features, but to me that seems more like an issue of ergonomics, modularity and static analyzability. It definitely can also lead to good heuristics for helping the less paranoid among us stay secure by default (though I have trouble imagining what kinds of attacks we could actually prevent this way). But for an industry that's nervous about using Rust at all because of this, I'm not seeing how making build.rs scripts less common or more restricted (I assume we can't kill them off completely) leads to a meaningful security benefit that they can't already get much more easily and thoroughly with an unrelated-to-Rust sandbox. That gives you total control over what you think should be "safe" for a build.

Ixrec avatar Oct 19 '18 10:10 Ixrec

I'm unclear on what the precise blocking issues are supposed to be here. In particular, why is "do all your Rust builds in docker" not a solution? Don't you already have to do all your C/C++ builds inside something like that if your threat model includes third-party build scripts?

Right - we do not use 3rd party C/C++ libraries because of this reason. What will happen in practice is that any use of Rust that we do will have to have 3rd party extern crates banned because of security concerns. Our legal/security teams are not going to vet each version of each crate.

My hope is that if we at least had a 'safe' crate-badge on crates.io + a toml flag that banned the use of a build.rs - that is a much more palatable crate with which I could convince security + legal people to allow.

And as a further (future) step - some kind of sandboxing of the build.rs such that it cannot just read the entire filesystem and post its contents on the internet would be much welcomed.

sheredom avatar Nov 02 '18 16:11 sheredom

There is new work on this topic: https://internals.rust-lang.org/t/pre-rfc-procmacros-implemented-in-wasm/10860

est31 avatar Sep 03 '19 02:09 est31

A platform-agnostic approach could be to transparently provide a fake/wrapper implementation of std::env, std::process, std::fs, std::io, std::net, etc that is used when compiling/running code in macros and build.rs, instead of letting them do any real/unfiltered I/O

I'd prefer it if macros and build.rs could not:

  • read arbitrary environment variables (e.g. secrets in CI) but only an explicit allow-list of variables (we might determine that there's a safe default set including CXXFLAGS, RUSTFLAGS, TMP, etc)
  • read any file path outside the project directory (where Cargo.toml is) or TMP or /usr
  • make network requests of any kind

If macros and build.rs need code from elsewhere, cargo (something we trust) should be configured to download that instead

Exceptions to the "sandbox" should be configured in Cargo.toml, but should be ignored if defined within dependencies (just like Cargo.lock)

EDIT: allowing access to /usr is important :)

jokeyrhyme avatar May 20 '22 19:05 jokeyrhyme

In practice it's pretty important that the build.rs of Rust -sys crates are able to look in directories like /usr/lib and invoke tools like pkg-config and such. This is one of the reasons why it seems difficult to do something like what you suggest, it would break a lot of build.rs's in the wild, and ones that don't really have an alternative (and locating libraries like this is one of the main points of build scripts).

thomcc avatar May 20 '22 19:05 thomcc

It is reasonable that those crates do that, but having them explicitly state in an allow-list in their Cargo.toml what they're doing seems like a good way to make clear what's going on.

Lokathor avatar May 20 '22 19:05 Lokathor

Please no, I already have enough bug reports from new users trying to get rusqlite to work, adding another step like "okay also put libsqlite3-sys in the bikeshed.allowlist list would just complicate this further.

If it's opt-in then sure, but if it's this way by default it will be a huge additional headache for new users trying to use crates which depend on -sys crates (which is very common!)

thomcc avatar May 20 '22 19:05 thomcc

I can't speak for jokeyrhyme, but personally I was imagining that a crate would declare it's required allow list in its own Cargo.toml, the same way it declares its own license and if it needs panic=unwind or to always use wrapping math and other things like that. Then the dev puts all the info in one consistent place for users to look at (or check via automated tool) the same way they can check if a crate's license works for them.

Lokathor avatar May 20 '22 19:05 Lokathor

I'm saying that if this is required for new users to use any -sys crate, it will be an enormous stumbling block.

Anyway, it's hard to see how that wouldn't be a massive breaking change, since it would completely prevent this very common use case of build.rs by sys crates to find their libraries.

thomcc avatar May 20 '22 19:05 thomcc

In practice it's pretty important that the build.rs of Rust -sys crates are able to look in directories like /usr/lib and invoke tools like pkg-config and such

Bah, I had /usr in my mind when I started typing my suggestion, but somehow left it out, oops

Thanks for reminding me, updated: https://github.com/rust-lang/rfcs/issues/1515#issuecomment-1133230382

I'm sure there's some amount of work we can do here that doesn't break 90% of crates and is also significantly more secure than the status quo

jokeyrhyme avatar May 20 '22 19:05 jokeyrhyme

I think for build.rs and proc-macro, we can at least set up a sandbox that prevents it from modifying anywhere other than the crate itself, or even limit it to modifying /target only.

Then we can prevent accessing to certain dir that makes absolutely no sense, such as /boot, /sys, /home directories except the current user unless they are in $PATH and thendir are at least searchable, same for /root unless it's run as root.

NobodyXu avatar Jan 19 '23 15:01 NobodyXu

And it also makes sense to

  • use process namespace combined with mounting procfs with hidpid=2 to prevent the build.rs or proc macro from accessing info of other processes.
  • prevent access to /dev except dor /dev/zero, /dev/null, /dev/random and /dev/urandom
  • drop all capabilities and set NO_NEW_PRIVS to prevent use of setuid/setgid programs
  • Hide secrets such as $CARGO_HOME/credential, $HOME/.git-credential

I think it would be best to archive this by a new environment variables which can specify a command for running build.rs and proc-macro.

NobodyXu avatar Jan 20 '23 05:01 NobodyXu

I've created a post here for my ideas.

NobodyXu avatar Jan 20 '23 06:01 NobodyXu

Starting with OS-specific features like process namespaces could make it difficult to achieve the same baseline isolation across all the platforms that Rust supports

Given that build.rs interacts with the world via std, an OS-agnostic approach could be to provide build.rs with a proxy version of std that implements the desired isolation

OS-specific features (like process namespaces) could be used instead of the proxy std as a performance optimisation (provided the same level of isolation is achieved)

jokeyrhyme avatar Jan 20 '23 06:01 jokeyrhyme

Starting with OS-specific features like process namespaces could make it difficult to achieve the same baseline isolation across all the platforms that Rust supports

Yes, that's why I think it's better to let users decide how to sandbox it by invoking the binary they specified with path to build.rs

Given that build.rs interacts with the world via std, an OS-agnostic approach could be to provide build.rs with a proxy version of std that implements the desired isolation

It's common for them invoke external command so this won't work very well in practice.

Also it could be work around by using libc or extern "C" directlt.

OS-specific features (like process namespaces) could be used instead of the proxy std as a performance optimisation (provided the same level of isolation is achieved)

By letting user specify the binary to do the sandboxing, we could even use gVisor or firecracker which creates a mini VM but setup like container.

NobodyXu avatar Jan 20 '23 07:01 NobodyXu