RFC: proc macro `include!`
Allow include! to be implemented in proc macros, by adding a proc_macro API to read files as Vec<u8>, String, or TokenStream. If the file is read as TokenStream, it is given Spans appropriate for diagnostics to point into the read file. In all cases, the build system knows which file(s) have been read, and can cache results / rerun the macro as desired.
we'd likely also want a way to list files in a directory, though that may be more difficult to integrate into build systems
We may want to specify that a non-existent file or directory produces a NotFound error, just like File::open, because proc macros are likely to want to probe for specific files (e.g. a config override file for a particular directory) and fall back to some default if they don't exist.
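To illustrate the probing pattern described above, here is a minimal sketch using only `std::fs`. The helper name `read_or_default` is hypothetical, and the proposed proc_macro API is not used here; the point is only how a `NotFound` error would let a macro fall back to a default.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Reads an optional override file, falling back to `default` when the
/// file does not exist. Any other I/O error is passed on to the caller.
/// (Hypothetical helper; not part of the RFC.)
fn read_or_default(path: &Path, default: &str) -> io::Result<String> {
    match fs::read_to_string(path) {
        Ok(contents) => Ok(contents),
        // A missing file is expected while probing, so it is not an error.
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(default.to_owned()),
        Err(err) => Err(err),
    }
}
```

If the proposed API reported a missing file in some other way, every macro doing this kind of probing would need its own existence check first.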
It would be useful to somehow allow Spans into include_str files as well, so the proc-macro can report errors (or induce later rustc errors) that reference into non-TokenStream compatible syntaxes.
> allow Spans into include_str files as well
I agree that this is desirable. As such, I'm torn on whether include_str should return (logically) (String, Span). However, there's currently no (even unstable) way for a proc macro to split a given Span into smaller parts, so even with a Span, error reporting into a file that the Rust lexer cannot handle wouldn't yet be possible.
I'd like to keep this RFC focused on the "read file via the build system" architecture, so introducing a split-span architecture would (imo) overextend the RFC.
Perhaps the best short term approach is to just drop include_str, and have proc macros include_bytes and String::from_utf8 for the time being until include_str can give a useful Span.
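The short-term approach (read bytes, then validate as UTF-8) could look roughly like the sketch below. `include_bytes_standin` is a hypothetical stand-in that just calls `std::fs::read`; the real proposed function would also register the file with the build system, which this sketch does not do.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for the proposed byte-reading API. It reads the
/// file with `std::fs` and performs no build-system tracking.
fn include_bytes_standin(path: &Path) -> io::Result<Vec<u8>> {
    fs::read(path)
}

/// Reads a file as bytes and validates it as UTF-8 afterwards.
fn include_string_via_bytes(path: &Path) -> io::Result<String> {
    let bytes = include_bytes_standin(path)?;
    // `String::from_utf8` takes ownership of the buffer, so the bytes are
    // not copied when validation succeeds.
    String::from_utf8(bytes).map_err(|err| io::Error::new(io::ErrorKind::InvalidData, err))
}
```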
Two points:
- How should relative paths be handled? Relative to the pwd of the rustc process? Or like `include_str!`, which resolves relative to the `.rs` file containing the invocation? This becomes especially interesting when the proc macro is called with a relative path: it would be weird for users if that worked differently from `include_str!` (say, an `include_cpp!` proc macro). Maybe one could pass an optional span to the functions, and it would resolve the directory relative to the span's location; if no span is passed, it would resolve relative to the top-level span of the crate, or in other words, relative to lib.rs.
- I'm wondering about making the return for include_bytes and include_str opaque, or at least using something that supports the backing buffers coming from a mmap call instead of having to use the standard Rust allocator. Think of instances where the included files are, say, 4 GiB large or something. Reading the entire file into memory is very inefficient in that instance, and ideally one defers reading the complete file until the linking step. This optimization does not have to be implemented, but I feel that it should be possible to implement it in the future. Also, maybe the implementation only wants to read parts of a file, in which instance reading the entire file and copying it to RAM is wasteful. Mapping it to memory reads only the needed parts of it.
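For the first point, resolving a path relative to the invoking file (the way `include_str!` does) can be sketched with plain `std::path`. The function name `resolve_relative_to_file` is hypothetical, and how a proc macro would obtain the invoking file's path from a span is left open here.

```rust
use std::path::{Path, PathBuf};

/// Resolves `requested` against the directory that contains `invoking_file`.
/// An absolute `requested` path is returned unchanged, because
/// `Path::join` replaces the base when given an absolute path.
/// (Hypothetical helper; not part of the RFC.)
fn resolve_relative_to_file(invoking_file: &Path, requested: &Path) -> PathBuf {
    // `parent()` is `None` only for a root or empty path; fall back to "".
    let base = invoking_file.parent().unwrap_or_else(|| Path::new(""));
    base.join(requested)
}
```

With this rule, a macro invoked in `src/widgets/button.rs` with `button.css` would read `src/widgets/button.css`, and the "no span" fallback corresponds to passing the path of lib.rs as `invoking_file`.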
> I'm wondering about making the return for include_bytes and include_str opaque, or at least using something that supports the backing buffers coming from a mmap call instead of having to use the standard Rust allocator.
How about:
```rust
trait BytesBuf: AsRef<[u8]> {
    // may be expensive
    fn into_vec(self: Box<Self>) -> Vec<u8>;
}

fn include_bytes<P: AsRef<str>>(path: P) -> Result<Box<dyn BytesBuf>, std::io::Error>;
```
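As a rough sketch of how this trait could be satisfied, the simplest backing store is a plain `Vec<u8>`. The `include_bytes` body below is an assumption for illustration (it just calls `std::fs::read`); a mmap-backed buffer could implement the same trait, but that is not shown.

```rust
use std::fs;
use std::io;

/// The trait proposed above, repeated so this sketch compiles on its own.
trait BytesBuf: AsRef<[u8]> {
    // may be expensive
    fn into_vec(self: Box<Self>) -> Vec<u8>;
}

/// Heap-backed implementation: the buffer already is a `Vec<u8>`, so
/// converting it is just a move out of the box.
impl BytesBuf for Vec<u8> {
    fn into_vec(self: Box<Self>) -> Vec<u8> {
        *self
    }
}

/// Illustrative body for the proposed function (an assumption): read the
/// whole file eagerly and hide the concrete buffer type behind the trait.
fn include_bytes<P: AsRef<str>>(path: P) -> Result<Box<dyn BytesBuf>, io::Error> {
    let path: &str = path.as_ref();
    let bytes = fs::read(path)?;
    Ok(Box::new(bytes))
}
```

One thing to watch for when using the boxed value: `Box<T>` has its own `AsRef<T>` impl, so callers may need `(*buf).as_ref()` to get at the `&[u8]` view.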
I would really like for the proc macro version of include_str! (and probably include_bytes! as well, for consistency) to return a Span for the included string. See rust-lang/rust#92565.
This would open tonnes of doors and allow extending the Rust language to work with things like single-file components in frontend frameworks written in Rust.
I've finally gotten around to updating the RFC text for the comments here. Changelog:
* `include_str` now produces [`Literal`](https://doc.rust-lang.org/nightly/proc_macro/struct.Literal.html#) instead of `String`
* Mention questions brought up as well as the alternative of more specialized wrapper types for returns.
> I've finally gotten around to updating the RFC text for the comments here. Changelog:
>
> * `include_str` now produces [`Literal`](https://doc.rust-lang.org/nightly/proc_macro/struct.Literal.html#) instead of `String`
> * Mention questions brought up as well as the alternative of more specialized wrapper types for returns.
Does this allow you to split the Literal into Spans?
Edit: never mind, I see there is a subspan function.
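For context, the unstable `Literal::subspan` takes a byte range, so a macro reporting an error inside an included file would first need to work out the byte offsets of the offending region. A minimal sketch of that step is below; `line_byte_range` is a hypothetical helper, and whether offsets into the file contents would map one-to-one onto the literal produced by `include_str` is an assumption, not something the RFC text here confirms.

```rust
use std::ops::Range;

/// Returns the byte range of the 1-based line `line_no` within `source`,
/// excluding the line terminator, or `None` if there is no such line.
/// (Hypothetical helper; not part of the RFC.)
fn line_byte_range(source: &str, line_no: usize) -> Option<Range<usize>> {
    let mut start = 0;
    for (index, line) in source.split_inclusive('\n').enumerate() {
        if index + 1 == line_no {
            // Drop a trailing "\n" or "\r\n" so the range covers only the text.
            let content = line.trim_end_matches(|c| c == '\r' || c == '\n');
            return Some(start..start + content.len());
        }
        start += line.len();
    }
    None
}
```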
> we'd likely also want a way to list files in a directory, though that may be more difficult to integrate into build systems
how is this not already possible with Span::call_site().source_file()?
> > we'd likely also want a way to list files in a directory, though that may be more difficult to integrate into build systems
>
> how is this not already possible with Span::call_site().source_file()?
Listing files is already possible using std::fs. The issue is that, since Cargo doesn't know about it, it won't rerun the macro if you add new files to that directory, or delete files, modify files, or modify file attributes. Therefore I think the proc-macro API (and probably something for build.rs too) should include functions that inform Cargo that you depend on certain directories, so it'll rerun if you change them.
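To make the gap concrete: build scripts can already print a `cargo:rerun-if-changed=PATH` directive, while a proc macro that lists a directory with `std::fs` has no stable way to say the same thing. The sketch below shows both halves using only the standard library; the helper names are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Formats the build-script directive that asks Cargo to rerun the script
/// when `path` changes. A build script would print this line to stdout.
fn rerun_if_changed(path: &Path) -> String {
    format!("cargo:rerun-if-changed={}", path.display())
}

/// Lists the entries of `dir`, sorted so the result is deterministic.
/// Nothing here tells Cargo about the dependency on `dir`.
fn list_dir_sorted(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut entries = fs::read_dir(dir)?
        .map(|entry| entry.map(|e| e.path()))
        .collect::<io::Result<Vec<_>>>()?;
    entries.sort();
    Ok(entries)
}
```

In a build.rs one would write something like `println!("{}", rerun_if_changed(Path::new("assets")));`. Inside a proc macro that line has no effect, which is why a dedicated tracking function is being asked for.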
> > we'd likely also want a way to list files in a directory, though that may be more difficult to integrate into build systems
>
> how is this not already possible with Span::call_site().source_file()?
Additionally, the path() part has been nightly-only since the dawn of time, so it's not something we've been able to use on stable yet.
(This would also likely be blocked in unstable limbo by the same concerns as the tracked path interface.)
FWIW, the tracked_path API now supports tracking any changes within a directory. The change happened alongside the buildscript rerun-if getting that functionality, which replaced the older, mostly useless behavior of watching only the directory entry itself (i.e. its metadata) for changes.
The "perfect" solution (with respect to tracking only) is to use a WASI target or similar to instrument all environment access, so that it can be transparently tracked, sandboxed, and whatever else the compiler sees as reasonable. For what this RFC is directly trying to address (spanned manipulation of newly accessed files), though, this API surface is still required even with perfect instrumentation of environment access.