miri
Finer grained isolation control
Right now miri only supports isolation or no isolation. Would it be possible to have fine-grained isolation/access control?
For example, allowing access to /dev/null but isolating everything else.
Possible, in theory, sure. But it's also a bottomless pit of complexity that I am not sure we want to get into. Even just making it fine-grained by API would already be quite verbose (file system, clock, env vars, ...). We do have flags to forward or exclude specific env vars, and I am already wondering if that was a mistake...
You can run Miri inside existing sandboxing solutions such as bubblewrap or firejail to achieve that kind of control. I am not sure if it is worth it for us to re-implement all that logic, and I am worried that we might not be able to uphold the security promises implied by this. Miri is generally concerned with correctness of UB checking, so sandboxing is usually not our first concern. I am already worried people might rely too much on the default isolation mode for sandboxing -- it exists mostly to protect Miri from the environment (i.e., to make Miri fully deterministic), not to protect the environment from Miri.
We do have flags to forward or exclude specific env vars, and I am already wondering if that was a mistake...
Maybe this is a separate issue, but has this been useful outside of the TERM issue that we recently solved? Just trying to get some idea of if there would be pain if we removed this.
@RalfJung I see. Speaking for myself, I always understood miri's isolation as just a means of providing determinism, not as a security mechanism, and I wouldn't run code under miri that I wouldn't run otherwise. This is more about letting my code use things that I know won't introduce non-determinism, /dev/null being a good example.
Maybe we could at least emulate /dev/null and similar files in miri?
Maybe this is a separate issue, but has this been useful outside of the TERM issue that we recently solved?
Not that I know of.
Maybe we could at least emulate /dev/null and similar files in miri?
Hm... I guess we could consider some files to be essentially like shim functions provided by the host system, and provide 'fake' implementations of those files when isolation is enabled -- that does not seem completely ridiculous. It sounds in line with the clock PR by @pvdrz.
Would these fake implementations be built into miri? Or should this be something users can customize?
If it's the former, what would be a clear way to decide which files are essential and which aren't?
We should discuss user-provided shims in another issue.
What we provide, with isolation on, should be totally deterministic and correct. So /dev/null is a candidate because you can't observe the environment through it. I think.
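To make the determinism argument concrete, here is a minimal sketch of what a 'fake' /dev/null could look like as an ordinary Rust type. `NullDevice` is a hypothetical name for illustration, not Miri's actual shim code; the point is only that neither operation needs to consult the host.

```rust
use std::io::{self, Read, Write};

/// Hypothetical stand-in for /dev/null (illustration only, not a Miri type).
struct NullDevice;

impl Read for NullDevice {
    // Reads always report end-of-file, regardless of the buffer size.
    fn read(&mut self, _buf: &mut [u8]) -> io::Result<usize> {
        Ok(0)
    }
}

impl Write for NullDevice {
    // Writes are accepted in full and the data is discarded.
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

fn main() -> io::Result<()> {
    let mut dev = NullDevice;
    dev.write_all(b"this data goes nowhere")?;

    let mut contents = Vec::new();
    let n = dev.read_to_end(&mut contents)?;
    assert_eq!(n, 0);
    assert!(contents.is_empty());
    Ok(())
}
```

Since the result of every call depends only on its arguments, a program cannot observe anything about the environment through such a handle.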
If it's the former, what would be a clear way to decide which files are essential and which aren't?
Basically anything that POSIX defines as a magic well-known file is a candidate, in principle.
Just spitballing here though, not sure yet if all this is really a good idea.^^
Yeah, the isolation-compatible clock enables running a lot of code that we couldn't before, and code that would reliably change between runs due to the different times pulled from the system clock. I am not sure the same logic applies to magic files. I don't think people depend on them very often.
Also, I suspect a program which does I/O to a magic file also wants other files... though I am aware I have made arguments of this form before and later corrected myself.
For example, I can imagine a test suite where the author is sure that some file always has some specific contents. Unfortunately we, writing the interpreter, can't be sure of that (also, what if they're wrong?), so we can't really extend that guarantee into the implementation. So in such a case I think the only fair suggestion is to avoid doing the I/O if you want isolation. For example, by storing the contents you would read from the file in a const.
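One way the const suggestion could look in a test, as a sketch: the code under test takes any `BufRead` instead of opening a file itself, and the test feeds it the expected contents from a const. `FIXTURE` and `count_lines` are made-up names for illustration.

```rust
use std::io::{self, BufRead};

// Hypothetical fixture: the contents the test would otherwise read from disk.
const FIXTURE: &str = "alpha\nbeta\ngamma\n";

// Code under test accepts any buffered reader, so it never has to touch
// the file system when the caller already has the data in memory.
fn count_lines<R: BufRead>(reader: R) -> io::Result<usize> {
    let mut count = 0;
    for line in reader.lines() {
        line?; // propagate read errors
        count += 1;
    }
    Ok(count)
}

fn main() -> io::Result<()> {
    // `&[u8]` implements `BufRead`, so the const can be passed in directly.
    let n = count_lines(FIXTURE.as_bytes())?;
    assert_eq!(n, 3);
    Ok(())
}
```

Outside of tests, the same function can still be called with a `BufReader` wrapped around a real `File`.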
I don't think people depend on them very often.
I did expect this to be a niche use case. :sweat_smile: In my case, I have unit tests for interfaces that involve OwnedFd. I only need to create an arbitrary fd, so I chose /dev/null. The tests don't depend on the content of the file.
I suspect a program which does I/O to a magic file also wants other files.
Can we just extend this to magic files for now? Do you consider implementing this not very cost-effective because not many people depend on it?
Do you have an example program or test suite that you want to make work with isolation on? I can do some dirty hacking pretty quickly if I know what the target is to assess how much code this will take.
I have an interface that uses OwnedFd, and I use a mock of that interface in testing, to test something that is built on top of the interface. Sorry about talking in such abstract terms.
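So from miri's perspective, it's opening /dev/null, duplicating the file descriptor, then closing both.

    use std::os::unix::io::*;

    fn main() {
        // Open /dev/null only to obtain some valid file descriptor.
        let f = std::fs::File::open("/dev/null").unwrap();
        let f: OwnedFd = f.into();
        // Borrow the descriptor and duplicate it into a second OwnedFd.
        let f2 = f.as_fd();
        let f2 = f2.try_clone_to_owned().unwrap();
        // Dropping an OwnedFd closes the underlying descriptor.
        drop(f2);
        drop(f);
    }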
I am not very interested in canned examples, my general goal is to empower people to use Miri on existing code with as few modifications as possible. If you can provide a link to a Rust project which has an extensive test suite, that would be ideal.
I could imagine a similar situation with num_cpus, where it reads some files to figure out if cgroups are restricting the number of cores. But in that specific case it is not that important, because num_cpus then does something else if it cannot read the file.
At the same time, I think that wouldn't count as one of the Unix magic files, so it is not the best example.
I agree with @saethlin that we should implement this behavior if it allows people to use miri without having to use extra flags and stuff.
I am not very interested in canned examples
This is not a canned example, it is distilled from a code base I am currently working on. It's not yet suitable to be shared at the moment unfortunately.
Probably poor word choice on my part. I'm not that interested in distilled examples either.
In any case, it looks like POSIX requires 3 special files:
The following files shall exist on conforming systems and shall be both readable and writable:
/dev/null
An empty data source and infinite data sink. Data written to /dev/null shall be discarded. Reads from /dev/null shall always return end-of-file (EOF).
/dev/tty
In each process, a synonym for the controlling terminal associated with the process group of that process, if any. It is useful for programs or shell procedures that wish to be sure of writing messages to or reading data from the terminal no matter how output has been redirected. It can also be used for applications that demand the name of a file for output, when typed output is desired and it is tiresome to find out what terminal is currently in use.
The following file shall exist on conforming systems and need not be readable or writable:
/dev/console
The /dev/console file is a generic name given to the system console (see System Console). It is usually linked to an implementation-defined special file. It shall provide an interface to the system console conforming to the requirements of General Terminal Interface.
Luckily, POSIX specifies the behavior of /dev/null so we could implement that. Then /dev/tty and /dev/console are specified to exist, but the description of /dev/tty says "if any" so maybe we could use the same behavior as /dev/null and pretend there is no associated terminal? And /dev/console isn't required to be readable or writable so we could have a second special handle that just returns EPERM from every operation on it.
So that's only 2 special handle types. Maybe I can search for crates that try to open /dev/null...
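A rough sketch of that two-handle idea, to get a feel for its size. Everything here is hypothetical (`SpecialHandle`, `classify`); treating /dev/tty like /dev/null is only the guess floated above, not something POSIX mandates, and none of this is Miri's actual code.

```rust
use std::io;

/// The two hypothetical handle kinds discussed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SpecialHandle {
    /// Behaves like /dev/null: reads hit EOF, writes are discarded.
    Null,
    /// Every operation fails with a permission error (EPERM).
    Denied,
}

/// Map a path to a special handle, or `None` for ordinary files.
fn classify(path: &str) -> Option<SpecialHandle> {
    match path {
        // Assumption: pretend there is no controlling terminal.
        "/dev/null" | "/dev/tty" => Some(SpecialHandle::Null),
        "/dev/console" => Some(SpecialHandle::Denied),
        _ => None,
    }
}

impl SpecialHandle {
    fn read(self, _buf: &mut [u8]) -> io::Result<usize> {
        match self {
            SpecialHandle::Null => Ok(0),
            SpecialHandle::Denied => Err(io::ErrorKind::PermissionDenied.into()),
        }
    }

    fn write(self, buf: &[u8]) -> io::Result<usize> {
        match self {
            SpecialHandle::Null => Ok(buf.len()),
            SpecialHandle::Denied => Err(io::ErrorKind::PermissionDenied.into()),
        }
    }
}

fn main() {
    let null = classify("/dev/null").unwrap();
    assert_eq!(null.write(b"abc").unwrap(), 3);
    assert_eq!(null.read(&mut [0u8; 8]).unwrap(), 0);

    let console = classify("/dev/console").unwrap();
    let err = console.write(b"abc").unwrap_err();
    assert_eq!(err.kind(), io::ErrorKind::PermissionDenied);

    assert_eq!(classify("/etc/passwd"), None);
}
```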
(also it might be good to change the title of this issue, this discussion isn't really about fine-grained isolation)
In any case, it looks like POSIX requires 3 special files:
Of these 3, I think only /dev/null makes sense for now, until we have a better idea of usecases for the other 2.
I had commented on Discord about having a filesystem abstraction, which is basically the end state of this issue for fs isolation.
Fwiw, there are crates that purport to have higher-level fs abstractions (https://github.com/iredelmeier/filesystem-rs) and there is always stealing FUSE's api.
For my needs, I want to use miri as the basis of a simulation engine à la FoundationDB. It is so close; it just needs a couple more abstractions and knobs to control the implementation of things that interact with the host (clock, fs, network, entropy).
I don't quite see the relationship between Miri and foundationdb. But anyway, given that all extra code comes with a cost, I don't think we want to currently carry any code that is not directly useful for Miri's primary purpose -- detecting UB in Rust code.