syn
syn copied to clipboard
lookahead1 is_empty, or Peek for Nothing
Ideally I would be able to say
let la = input.lookahead1();
if la.peek(...) {
...
} else if la.peek(syn::parse::Nothing) {
OR
} else if la.is_empty() {
handle the empty input case
} else {
return Err(la.error())
}
But Lookahead doesn't have an is_empty() method. And .peek doesn't work, I think just because Nothing isn't Copy.
I can use input.is_empty() but of course that doesn't note in the error message that empty input was permissible.
This would probably be achievable without breaking changes by adding a NothingToken(TokenMarker) type which just checks for EOF? maybe it could even be added to the Token! macro as Token![].
This would probably be achievable without breaking changes by adding a
NothingToken(TokenMarker)type which just checks for EOF? maybe it could even be added to theToken!macro asToken![].
I'd very much like something like that but, @dtolnay repeatedly argued against adding a way to peek for EOF, e.g., #1161 and https://github.com/dtolnay/syn/issues/889.
AFAICT it would be trivial for syn to add an EOF token that can be peeked, I had proposed a way here, https://github.com/dtolnay/syn/issues/1161#issuecomment-1792548261, but doing that outside of syn is obviously out of any stability guarantees, as it uses private traits (as a matter of fact the implementation I proposed no longer compiles).
I guess another solution would be to allow people to implement Peek themselves, but AFAIK that is not a desired thing right now.
Yeah peeking is simply not appropriate for EOF. peek2 and especially peek3 are already a problematic API, and shoehorning EOF into that exacerbates their issues significantly. I am open to exposing peek in a better way but just adding a trait impl is not it.
Peek is normally used for ascertaining whether a particular pattern of tokens would successfully parse if parsed from the input in a given order: if input.peek(Token![async]) && (input.peek2(token::Brace) || input.peek2(Token![move]) && input.peek3(token::Brace)). If this expression evaluates to true, then either parsing async followed by brace, or async followed by move followed by brace, must succeed. The input token tree including None-delimited groups can be async {} or «async {}» or «async» {} or async «{}» or «async» «{}» or ««async» {}» or ….
EOF is a different question than this because asking whether Cursor::eof can be true in a particular location is not a meaningful thing in the presence of None-delimited groups (consider input.peek(Token![async]) && input.peek2($eof) with something like «async» {}), and asking whether Nothing can be parsed at a particular location is especially not meaningful because it always can be parsed anywhere.
Ultimately, forking and parsing the tokens in the desired order is the foolproof way to ascertain whether they will parse successfully if parsed in that order.
I'm puzzled. You seem to be explaining that if there was a peek($eof), the forms peek2($eof) and peek3($eof) would go wrong.
But that doesn't seem to apply to input.is_empty()? Is there some reason why lookahead1.is_empty() would be a bad idea?
Right now I do this:
let la1 = input.lookahead1() {
if la1.peek(Ident) { /* stuff */ }
else if la1.peek(Lifetime) { /* stuff */ }
else if input.is_empty() { /* use default data or something */ }
else { Err(la1.error()) }
But this produces inaccurate error messages because the Lookahead1 "didn't see" me test for emptiness.
Lookahead1::is_empty would solve my problem, I think.
My last comment was in response to ModProg's ping and "I'd very much like something like that" (NothingToken) and "AFAICT it would be trivial for syn to add".
lookahead1.is_empty() is fine as long as it correctly suggests the right closing delimiter from the surrounding context, instead of just putting "or end of input" into error messages. Similar to these rustc errors:
error: expected one of `.`, `;`, `?`, `}`, or an operator, found `@`
--> src/lib.rs:1:19
|
1 | fn f() -> i32 { 1 @ }
| ^ expected one of `.`, `;`, `?`, `}`, or an operator
error: expected one of `,`, `.`, `;`, `?`, `]`, or an operator, found `@`
--> src/lib.rs:2:25
|
2 | fn g() -> [i32; 1] { [1 @] }
| ^ expected one of `,`, `.`, `;`, `?`, `]`, or an operator
My last comment was in response to ModProg's ping and "I'd very much like something like that" (
NothingToken) and "AFAICT it would be trivial for syn to add".
Sorry, my comment was probably snarkier than it had to be, I did not consider non-delimited groups. If I understand that correctly, this would always be a problem while parsing, right? If I have «|async» fn (» being the none delimited group, and | the current parse position), peek2(Token![fn]) would return false, right? I wonder if a better behavior would be to ignore the non-delimited group, though there are situations where correct handling of non-delimited groups is relevant (IIRC that is mainly for precedence in expressions).
I think having an Eof token that behaves in peek like Token![fn] should not be any more problematic. Though as Eof would not be a useful in an error message, an Eof token is probably not part of a solution for the lookahead issue.
lookahead1.is_empty()is fine as long as it correctly suggests the right closing delimiter from the surrounding context, instead of just putting "or end of input" into error messages. Similar to these rustc errors:
Maybe is_empty(delimiter) could be an option, where delimiter needs to be able to convey })] and something like end of macro input as a proc-macro doesn't know which delimiter started it.
If I have
«|async» fn(»being the none delimited group, and|the current parse position),peek2(Token![fn])would returnfalse, right?
I think it's impossible to have a ParseStream with the tokens you showed and with the cursor at the position you showed.
The issue is rather |«async» fn in which the naïve eof token that has been thrown around (with fn peek(cursor) -> bool { cursor.eof() }) would report peek2(eof) is true.
The issue is rather
|«async» fnin which the naïve eof token that has been thrown around (withfn peek(cursor) -> bool { cursor.eof() }) would reportpeek2(eof)is true.
After looking at both the peek2 and Cursor impls, I understand what you mean, peek2 will enter a non-delimited group transparently, and Cursor::ident etc. will exit non-delimited groups automatically, while Cursor::eof does not.
So to support a peek like that, one would need to handle non-delimited groups explicitly (I think one would need to call ignore_none() first).
#1625 solved the problematic case described in https://github.com/dtolnay/syn/issues/1550#issuecomment-2017018788.
Separately, #1680 + #1681 added a way that Lookahead1 will be able to report the correct closing delimiter at the end of a group per https://github.com/dtolnay/syn/issues/1550#issuecomment-2016891573.
I think this is ready to move forward if someone can provide realistic example code for the docs. Preferably 2 examples: one with peek2, and one with lookahead1 where we can demonstrate the resulting error message.