proc-macro2
proc-macro2 copied to clipboard
Investigating compressing fallback spans into fewer bytes
The compiler's proc_macro::Span uses an efficient encoding in 4 bytes. Currently our span when procmacro2_semver_exempt is enabled is a whopping 12 bytes because it essentially looks like:
enum Span { // tag + padding = 4 bytes
Compiler(proc_macro::Span), // 4 bytes
Fallback(struct Span { // 8 bytes
lo: u32,
hi: u32,
}),
}
And it goes up to 16 bytes when embedded next to 8-byte pointers due to additional padding.
This has a performance impact because syntax trees contain lots of spans.
We need to investigate interning the spans or some other trick to cut the size.
Interestingly the Span type within rustc is actually 8 bytes long in its "compressed" form!
https://github.com/rust-lang/rust/blob/85a360e0ea2f1629b8851e7c9b2903bbdbab42bf/src/libsyntax_pos/span_encoding.rs#L13-L65
That being said, the spans which we get within proc-macro are, in fact, smaller than that. Thanks to the bridge, they are passed around as a NonZeroU32 handle type, which is stored in a handle table. https://github.com/rust-lang/rust/blob/85a360e0ea2f1629b8851e7c9b2903bbdbab42bf/src/libproc_macro/bridge/handle.rs#L9
The table is effectively a 2-way mapping from handle to value, and from value to handle. Something vaguely like the following:
struct SpanInterning {
data: BTreeMap<Span, SpanInfo>,
interner: HashMap<SpanInfo, Span>,
}
Interestingly, as Span is Copy, it could also be possible for us to store it within an untagged union and select which version to use based on the return value of nightly_works(), which combined with these changes could lead to 4-byte proc_macro2::Span values, although that may not be sound.
That seems promising. It would be good to benchmark the 4 byte approach using an intern table and nightly_works against an 8 byte approach laid out as:
- First 4 bytes: union { proc_macro::Span, hi }
- Second 4 bytes: lo, where u32::MAX means that we're a compiler span
because our lexer always produces lo and hi separated by the width of the token, so lo can't be u32::MAX.
That seems promising. It would be good to benchmark the 4 byte approach using an intern table and nightly_works against an 8 byte approach laid out as:
* First 4 bytes: union { proc_macro::Span, hi } * Second 4 bytes: lo, where u32::MAX means that we're a compiler spanbecause our lexer always produces lo and hi separated by the width of the token, so lo can't be u32::MAX.
Hmm, yeah that could work. We'd likely also want to have some sort of dynamic test failure in case rustc ever decides to change size_of::<Span>(). I wonder if it's worthwhile to make lo and/or hi be NonZeroU32 like proc_macro::Span is. IIRC we don't have Option<Span> often, so might not be meaningful savings.
I'm also pretty sure that, even if lo could be u32::max_value(), it wouldn't be an issue to lower the ceiling by one.
Syn has quite a few places that are Option<Token![ ]> where the token type only holds the token's span, so it would be good to use NonZeroU32 where possible.