JuliaSyntax.jl
`i++` could use a better error message
I'm opening this issue based on a question on discourse.
Consider the following code:
i = 0
for n = 1:5
i++
end
This results in the following error (at least on Julia 1.7.0):
ERROR: syntax: unexpected "end"
This error message could benefit from JuliaLang/julia#45791. However, I'm opening a separate issue for this special case because a new user might write i++ not knowing that it is not equivalent to i += 1 (or even that ++ is parsed as an infix operator). Would it be possible to have a more specific error message? Something like
ERROR: syntax: attempted to call an infix operator with just one argument - perhaps you meant "i += 1"?
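For reference, here is a small sketch of how the parser treats `++` compared with `+=`, assuming a stock Julia 1.x session. The names `a`, `b`, and `i` are placeholders and nothing needs to be defined for the parse to succeed.

```julia
# `++` is parsed as a binary (infix) operator, so `a ++ b` is a call to `++`.
ex = Meta.parse("a ++ b")
@assert ex.head == :call
@assert ex.args == [Symbol("++"), :a, :b]

# `i += 1` is an update-assignment expression, not an operator call.
ex2 = Meta.parse("i += 1")
@assert ex2.head == Symbol("+=")
@assert ex2.args == [:i, 1]
```

So `i++` on its own is the left half of a binary call that is still waiting for its right-hand operand.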
Moved my comment to the other issue where it is more directly relevant. I doubt we can do anything specific to ++ here since there's no guarantee you'll have an end next, e.g.
i = 0
for n = 1:5
i++
f(i)
end
Couldn't we add some specific error handling when seeing the ++ token and it isn't a defined variable?
since there's no guarantee you'll have an end next
Right, but in that case you get ERROR: UndefVarError: ++ not defined, which I think is clear enough. People (likely beginners) who intended to do i += 1 will know that i++ isn't allowed in that sense, and people (likely more advanced users) who intended to use ++ as an operator will know to define it.
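To make the two outcomes concrete, here is a sketch, assuming a plain Julia 1.x session at top level in which `++` has not been defined. The definitions of `i` and `f` are placeholders added only so that `++` is the one missing name.

```julia
# A binary operator at the end of a line continues onto the next line,
# so `i++` followed by `f(i)` parses as the single call `i ++ f(i)`.
ex = Meta.parse("i++\nf(i)")
@assert ex.head == :call
@assert ex.args[1] == Symbol("++")
@assert ex.args[2] == :i
@assert ex.args[3] == :(f(i))

# Parsing succeeded; the failure only shows up when the call is evaluated.
i = 0
f(x) = x
err = try
    eval(ex)
catch e
    e
end
@assert err isa UndefVarError
```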
When the unexpected "end" error is about to be thrown, would it be possible to look at the previous token, check if it is ++, and if so throw a more informative error?
Couldn't we add some specific error handling when seeing the ++ token and it isn't a defined variable?
Nothing is defined at parse time, so it wouldn't be possible to know if ++ is defined while parsing. But during runtime an UndefVarError is thrown.
When the unexpected "end" error is about to be thrown, would it be possible to look at the previous token, check if it is ++, and if so throw a more informative error?
Rephrasing in a way I'm guessing would be easier to implement: When an operator is the current token, would it be possible to check if the operator is ++, and if so peek at the next token, see if it is unexpected (whether it be end or ] or whatever else), and if so throw a more informative error?
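A rough sketch of that peek-ahead idea, to show what I mean. This is a toy and not how JuliaSyntax.jl actually tokenizes or reports errors: `toy_tokens`, `plusplus_hint`, and the `closers` list are all hypothetical names made up for illustration, and the regex tokenizer ignores strings, comments, and most operators.

```julia
# Toy tokenizer: identifiers/keywords, the `++` operator, or any single
# non-whitespace character. Only meant to illustrate the peek-ahead check.
toy_tokens(src) = [m.match for m in eachmatch(r"[A-Za-z_]\w*|\+\+|\S", src)]

# Return a hint string if some `++` is followed by a token that cannot start
# a right-hand operand (or by nothing at all); otherwise return `nothing`.
function plusplus_hint(src; closers = ("end", ")", "]", "}"))
    toks = toy_tokens(src)
    for (k, t) in enumerate(toks)
        t == "++" || continue
        next = k < length(toks) ? toks[k + 1] : nothing
        if next === nothing || next in closers
            lhs = k > 1 ? toks[k - 1] : "x"
            return "`++` is an infix operator with no right-hand operand; perhaps you meant \"$lhs += 1\"?"
        end
    end
    return nothing
end
```

With this toy, `plusplus_hint("for n = 1:5\ni++\nend")` returns a hint mentioning `i += 1`, while `plusplus_hint("i++\nf(i)")` returns `nothing`, because there the next token could legitimately be the right-hand operand.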
a new user might write i++ not knowing it is not equivalent to i += 1 (or even that ++ is parsed as an infix operator)
I'm not sure what the parsing issue is. If it's because ++ is technically a legal infix operator, it's at least very uncommon to use, so could the linter (and therefore VS Code, by default, I think) at least catch this? Ideally Julia would catch it too, and would not have defined ++ as an infix operator in the first place, but if you want that dropped, it would need to wait for 2.0.
julia> for n = 1:5
i++
end
ERROR: ParseError:
# Error @ REPL[17]:3:1
i++
end
└─┘ ── invalid identifier
Stacktrace:
[1] top-level scope
@ none:1
This error message certainly is better (at least now we know what the offending end is), but I don't think it addresses this issue fully. Some questions remain (for someone who thinks i++ is the same as i += 1, or who doesn't know ++ is an infix operator), such as, "Why is end being treated as an identifier here?"
I want to avoid writing a lot of hand-engineered error messages for special cases which is why I haven't pushed very hard on fixing this specific issue (sorry!)
But I agree we should be able to do so much better here. Let's keep this open as a reminder (along with several similar issues).
The main difficulty with error messages right now is that the existing parser structure/abstractions are kind of inadequate for good error reporting. Why? Well, error recovery is inherently about making a statistical inference about the user's intentions. Doing this well depends on things we traditionally try to abstract away from a parser: things like the precise amount of whitespace, the names of identifiers, etc. So I think we need a big rethink of how this is going to work before we start adding lots of detailed error reporting for particular special cases.
I want to avoid writing a lot of hand-engineered error messages for special cases which is why I haven't pushed very hard on fixing this specific issue (sorry!)
No need to apologize, and I totally get the sentiment about hand-engineered cases.
the names of identifiers
I hadn't realized the parser didn't have this info. That does make it more difficult!