const-eval
Reading files from within const eval
(Filing this at @oli-obk's request, based on discussions on Zulip.)
We should be able to read files within const eval. This is equivalent to doing include_bytes! (which we already support at compile time), and should use the same machinery for "rebuild if this changes".
@oli-obk suggested that this could work by tagging specific portions of low-level file I/O as something akin to lang items, and then implementing that low-level file I/O by invoking the machinery that powers include_bytes!.
It's not yet possible to share the machinery between include_bytes! and the hypothetical machinery for const eval reading files, because include_bytes! runs at a time when no queries are available yet. In order for const eval to be able to read from files, we need to add a query that goes from PathBuf to &'tcx [u8] by reading the entire file in one go and interning it. This means that if you have a
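use std::{fs::File, io::Read};
const fn foo() -> String { // hypothetical: file I/O is not const today
    let mut s = String::new();
    File::open("foo.txt").unwrap().read_to_string(&mut s).unwrap();
    s
}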
you will get different results when calling this function from different crates, because each crate will resolve foo.txt to a different file (to this_crate_root/foo.txt, so right next to Cargo.toml).
I don't think we need to take care to ignore the target directory, because include_bytes can also read from that. Since we're caching all read files via the query system, we won't ever get a situation where the files differ between each read.
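The "read the whole file once, intern it, and reuse it" behaviour described above can be sketched outside the compiler roughly as follows. This is only an illustration under assumptions: `FileCache` and its `read` method are hypothetical names, the `load` closure stands in for the actual file read, and a real implementation would be a rustc query returning an interned `&'tcx [u8]` rather than an `Arc<[u8]>`.

```rust
use std::collections::HashMap;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::Arc;

/// Hypothetical stand-in for the proposed query: maps a path to the file's
/// full contents, reading each file at most once per compilation session.
#[derive(Default)]
pub struct FileCache {
    files: HashMap<PathBuf, Arc<[u8]>>,
}

impl FileCache {
    /// Returns the cached bytes for `path`, calling `load` only on the first
    /// request. `load` stands in for the real read (e.g. `std::fs::read`).
    pub fn read<F>(&mut self, path: &Path, load: F) -> io::Result<Arc<[u8]>>
    where
        F: FnOnce(&Path) -> io::Result<Vec<u8>>,
    {
        if let Some(bytes) = self.files.get(path) {
            // Already read during this session: hand out the same bytes.
            return Ok(Arc::clone(bytes));
        }
        let bytes: Arc<[u8]> = load(path)?.into();
        self.files.insert(path.to_path_buf(), Arc::clone(&bytes));
        Ok(bytes)
    }
}
```

Because later requests never touch the file system again, every reader within one session sees identical contents, even if the file on disk changes in the meantime.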
Two important things we need to take care of:
- Prevent any non-relative paths or relative paths that go outside the crate root. We can have different schemes to do that. One variant would be to treat the crate root as the file system root.
- Ensure the query is never cached in the incremental cache. Even if none of its inputs or input queries change, we need to reread the entire file in order to check whether the file changed.
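One possible scheme for the first point is a purely lexical check that rejects absolute paths and any relative path that climbs above the crate root. The sketch below is an assumption about how that could look, not a description of anything rustc does; `resolve_in_crate_root` is a hypothetical helper, and it implements the "reject" scheme rather than the "treat the crate root as the file system root" variant.

```rust
use std::path::{Component, Path, PathBuf};

/// Resolves `requested` against `crate_root`, returning `None` for absolute
/// paths and for relative paths that would climb above the crate root.
/// Purely lexical: symlinks are not considered in this sketch.
pub fn resolve_in_crate_root(crate_root: &Path, requested: &Path) -> Option<PathBuf> {
    let mut resolved = crate_root.to_path_buf();
    let mut depth = 0usize; // number of components below the crate root
    for component in requested.components() {
        match component {
            Component::Normal(part) => {
                resolved.push(part);
                depth += 1;
            }
            Component::CurDir => {}
            Component::ParentDir => {
                if depth == 0 {
                    return None; // would escape the crate root
                }
                resolved.pop();
                depth -= 1;
            }
            // Absolute paths (and Windows drive/UNC prefixes) are rejected.
            Component::RootDir | Component::Prefix(_) => return None,
        }
    }
    Some(resolved)
}
```

With this check, `data/../foo.txt` still resolves to `foo.txt` inside the crate root, while `../foo.txt` and `/etc/passwd` are refused.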
In order for const eval to be able to read from files, we need to add a query that goes from (CrateNum, PathBuf) to &'tcx [u8] by reading the entire file in one go and interning it. This means that if you have a
That would take care of dynamically computing the filename. However, @joshtriplett also asked to be able to read parts of a file without reading the whole file, which this would not do.
Prevent any non-relative paths or relative paths that go outside the crate root. We can have different schemes to do that. One variant would be to treat the crate root as the file system root.
Why that? Does include_bytes! do that? Certainly build.rs and proc macros can read anywhere on the file system... (IMO they should be sandboxed but that's a long and separate discussion.^^)
Two important things we need to take care of:
- Ensure that the query system never recomputes this query. Currently, for many queries dropping the cache is okay because the queries are all pure functions that can be recomputed any time. Reading from a file might be the first query to not have that property, so the entire query system needs to cooperate (and future changes in the query system need to take this into account).
Why that? Does include_bytes! do that? Certainly build.rs and proc macros can read anywhere on the file system... (IMO they should be sandboxed but that's a long and separate discussion.^^)
Because... we can do it? There's no need to start out a new feature without sandboxing if we can sandbox it from the start.
Ensure that the query system never recomputes this query. Currently, for many queries dropping the cache is okay because the queries are all pure functions that can be recomputed any time. Reading from a file might be the first query to not have that property, so the entire query system needs to cooperate (and future changes in the query system need to take this into account).
That's actually fine, because if recomputing the query causes the result to differ, it will poison all queries that called this query and force them to get recomputed, too. So if data flows from such a file to an array length, you will never see that array length change within the same compilation.
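The invalidation argument can be illustrated with a toy model: a dependent result remembers a fingerprint of the file contents it was computed from, and is recomputed whenever a fresh read yields a different fingerprint. This is a simplified sketch, not rustc's actual dependency tracking; `Dependent`, `fingerprint`, and `revalidate` are hypothetical names, and the "array length" here is simply the file's length.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hashes the file contents so that changes can be detected cheaply.
fn fingerprint(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

/// A cached result that was derived from a file's contents.
pub struct Dependent {
    /// Fingerprint of the file at the time `array_len` was computed.
    pub file_fingerprint: u64,
    /// Value derived from the file (here: an array length).
    pub array_len: usize,
}

impl Dependent {
    pub fn compute(file: &[u8]) -> Self {
        Dependent { file_fingerprint: fingerprint(file), array_len: file.len() }
    }
}

/// Re-checks `cached` against freshly read contents. Returns the value to use
/// and whether it had to be recomputed because the file changed.
pub fn revalidate(cached: Dependent, current: &[u8]) -> (Dependent, bool) {
    if fingerprint(current) == cached.file_fingerprint {
        (cached, false) // unchanged input: the old result is still valid
    } else {
        (Dependent::compute(current), true) // changed input: recompute
    }
}
```

The point of the model is that a changed file never leaves a stale dependent behind: either the fingerprint matches and the old value is reused, or everything derived from the file is recomputed together.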
On Wed, Jun 24, 2020 at 11:57:26PM -0700, Ralf Jung wrote:
In order for const eval to be able to read from files, we need to add a query that goes from (CrateNum, PathBuf) to &'tcx [u8] by reading the entire file in one go and interning it. This means that if you have a
That would take care of dynamically computing the filename. However, @joshtriplett also asked to be able to read parts of a file without reading the whole file, which this would not do.
That would be nice, but not necessary right away. I'd like the File API to just work in const eval, but it's fine for now if it just reads the entire file into memory and doesn't try to read only a subset.
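As a rough illustration of why reading the whole file first does not rule out a File-like API, partial reads can be served from the in-memory bytes with the standard `std::io::Cursor`. The helper `read_range` below is hypothetical and runs at runtime; it only shows the shape of the idea, not how const eval would implement it.

```rust
use std::io::{self, Cursor, Read, Seek, SeekFrom};

/// Reads `len` bytes starting at `offset` from contents that were already
/// loaded into memory in full. Fails if the range runs past the end.
pub fn read_range(contents: &[u8], offset: u64, len: usize) -> io::Result<Vec<u8>> {
    let mut cursor = Cursor::new(contents);
    cursor.seek(SeekFrom::Start(offset))?;
    let mut buf = vec![0u8; len];
    cursor.read_exact(&mut buf)?;
    Ok(buf)
}
```

The caller sees seek-and-read semantics, while the underlying file was only read once.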
Calling a const function at compile-time will always yield the same result as calling it at runtime, even when called multiple times. ~ https://doc.rust-lang.org/reference/const_eval.html#const-functions
How could this still hold with access to arbitrary files whose contents can change at any point?