macro_railroad
Multicharacter tokens not parsed correctly
macro_rules! x {
(= >) => {
println!("Space");
};
(=>) => {
println!("No space");
};
}
fn main() {
x!(= >);
x!(=>);
}
The two branches are currently seen as identical, even before optimizing. This is not correct.
The problem here is that we parse (or lower) Punct incorrectly:
The two arms parse as (roughly)
Ok(MacroRules { name: Ident(x), rules: [Rule { matcher: [Punct(Punct { op: '=', spacing: Alone }), Punct(Punct { op: '>', spacing: Alone })], expansion: TokenStream [Ident { sym: println }, Punct { op: '!', spacing: Alone }, Group { delimiter: Parenthesis, stream: TokenStream [Literal { lit: "Space" }] }, Punct { op: ';', spacing: Alone }] }] })
Ok(MacroRules { name: Ident(x), rules: [Rule { matcher: [Punct(Punct { op: '=', spacing: Joint }), Punct(Punct { op: '>', spacing: Alone })], expansion: TokenStream [Ident { sym: println }, Punct { op: '!', spacing: Alone }, Group { delimiter: Parenthesis, stream: TokenStream [Literal { lit: "No space" }] }, Punct { op: ';', spacing: Alone }] }] })
In the case of = > both Puncts are Alone. In the case of => the = is Joint, so it is combined with the >.
The macro_rules concept of a token is confusing, and it is not just a matter of checking whether the proc macro token is Joint or Alone:
macro_rules! x {
// These rules are always equivalent.
(=> >) => { println!("Space"); };
(=>>) => { println!("No space"); };
}
fn main() {
x!(=> >); // "Space"
x!(=>>); // "Space"
}
macro_rules! x {
// These rules are *not* equivalent.
(= >>) => { println!("Space"); };
(=>>) => { println!("No space"); };
}
fn main() {
x!(=> >); // "No space"
x!(=>>); // "No space"
}
macro_rules matchers greedily form groups of consecutive punctuation, left to right, according to which multi-character punctuation tokens are recognized by Rust's grammar; whitespace between groups is then ignored. (This is a limitation that is fixed in the token API of procedural macros.) So for example =>> and => > are equivalent because they both group as => >, while = >> and =>> are not equivalent because =>> is grouped as => >, which is different from = >>.
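The grouping rule described above can be sketched as follows. This is only an illustration, not the code used by macro_railroad or rustc: the helper name `group_punct` and the `MULTI_CHAR_PUNCT` table are hypothetical, the table is assumed to cover only the common multi-character punctuation, and the input is assumed to contain nothing but punctuation and whitespace.

```rust
/// Assumed table of multi-character punctuation, longest entries first so
/// that a greedy scan prefers `>>=` over `>>` over `>`.
const MULTI_CHAR_PUNCT: &[&str] = &[
    "<<=", ">>=", "...", "..=",
    "&&", "||", "<<", ">>", "+=", "-=", "*=", "/=", "%=", "^=", "&=", "|=",
    "==", "!=", ">=", "<=", "..", "::", "->", "=>",
];

/// Hypothetical helper: splits a string of punctuation into groups the way
/// the text describes, greedily from left to right, ignoring whitespace
/// between groups.
fn group_punct(input: &str) -> Vec<String> {
    let mut groups = Vec::new();
    let mut rest = input.trim_start();
    while !rest.is_empty() {
        // Take the longest table entry that matches at this position,
        // or fall back to a single character.
        let len = MULTI_CHAR_PUNCT
            .iter()
            .find(|p| rest.starts_with(**p))
            .map(|p| p.len())
            .unwrap_or_else(|| rest.chars().next().map_or(1, char::len_utf8));
        groups.push(rest[..len].to_string());
        // Whitespace between groups carries no meaning.
        rest = rest[len..].trim_start();
    }
    groups
}
```

Under these assumptions, `group_punct("=>>")` and `group_punct("=> >")` both give the groups `=>` and `>`, while `group_punct("= >>")` gives `=` and `>>`. Two matchers would then be treated as identical only if their group sequences are equal.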