JuliaSyntax.jl

Error recovery for unexpected continuation keywords

Open pfitzseb opened this issue 3 years ago • 1 comments

Consider

julia> JuliaSyntax.parse(JuliaSyntax.GreenNode, "if true; x ? true : elseif true end")[1]
     1:35     │[toplevel]
     1:35     │  [if]
     1:2      │    if
     3:3      │    Whitespace
     4:7      │    true                 ✔
     8:26     │    [block]
     8:8      │      ;
     9:26     │      [if]
     9:9      │        Whitespace
    10:10     │        Identifier       ✔
    11:11     │        Whitespace
    12:12     │        ?
    13:13     │        Whitespace
    14:17     │        true             ✔
    18:18     │        Whitespace
    19:19     │        :
    20:20     │        Whitespace
    21:26     │        [error]           ✘
    21:26     │          elseif         ✔
    27:27     │    Whitespace
    28:31     │    [error]               ✘
    28:31     │      true               ✔
    32:32     │    Whitespace
    33:35     │    end

This special case is fixed by #77, which punts the elseif into the containing block instead:

julia> JuliaSyntax.parse(JuliaSyntax.GreenNode, "if true; x ? true : elseif true end")[1]
     1:35     │[toplevel]
     1:35     │  [if]
     1:2      │    if
     3:3      │    Whitespace
     4:7      │    true                 ✔
     8:19     │    [block]
     8:8      │      ;
     9:19     │      [if]
     9:9      │        Whitespace
    10:10     │        Identifier       ✔
    11:11     │        Whitespace
    12:12     │        ?
    13:13     │        Whitespace
    14:17     │        true             ✔
    18:18     │        Whitespace
    19:19     │        :
    20:19     │        error             ✘
    20:20     │    Whitespace
    21:32     │    [elseif]
    21:26     │      elseif
    27:27     │      Whitespace
    28:31     │      true               ✔
    32:32     │      [block]
    32:32     │        Whitespace
    33:35     │    end

but of course that naive solution works only when a single token is missing or extraneous, so "if true; x ? true : foo ))))) elseif true end" will break it again.

Generally, this should be solvable by an arbitrarily long look-ahead for continuation keywords, but I really don't like that solution (and it might not even work in all cases).

pfitzseb, Aug 31 '22 11:08

> Generally, this should be solvable by an arbitrarily long look-ahead for continuation keywords

If we cap the look-ahead at some large but not huge value, I feel this is acceptable. But recovery and error reporting really are hard: in general they require both look-ahead and look-behind to do a good job.
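
To make the capped look-ahead idea concrete, here is a minimal sketch. It is not JuliaSyntax.jl code: the tokens are plain strings rather than the parser's token type, and `find_continuation`, `CONTINUATION_KEYWORDS`, and `max_lookahead` are hypothetical names chosen for illustration. The keyword set is also an assumption about what counts as a continuation keyword.

```julia
# Hypothetical sketch, not part of JuliaSyntax.jl.
# Assumed set of "continuation keywords"; the real parser may use a different set.
const CONTINUATION_KEYWORDS = ("elseif", "else", "catch", "finally")

# Scan at most `max_lookahead` tokens starting at index `pos` and return the
# index of the first continuation keyword, or `nothing` if the cap is reached
# (or the input ends) before one is found.
# Caveat: this ignores nesting, so a keyword belonging to an inner block
# would also match. A real implementation would need to track block depth.
function find_continuation(tokens::Vector{String}, pos::Int; max_lookahead::Int = 64)
    stop = min(length(tokens), pos + max_lookahead - 1)
    for i in pos:stop
        tokens[i] in CONTINUATION_KEYWORDS && return i
    end
    return nothing
end

# Hand-tokenized version of the breaking input from the issue.
tokens = ["if", "true", ";", "x", "?", "true", ":", "foo",
          ")", ")", ")", ")", ")", "elseif", "true", "end"]

# Start scanning right after the `:` of the ternary (index 7).
hit  = find_continuation(tokens, 8)                     # finds the `elseif`
miss = find_continuation(tokens, 8; max_lookahead = 3)  # cap reached first
```

With the default cap the scan skips the junk tokens and lands on the `elseif`, so everything before it could be wrapped in a single error node. With a cap of 3 it gives up, which is the trade-off a bounded look-ahead accepts in exchange for a predictable worst case.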

Any parsing algorithm which assumes the source is "well-formed by default" is likely to enter weird states where there's not enough local information available in the program to emit the ideal error. Hence wanting #93 or something similar... (Really, I believe parsing of broken text and emitting error messages and warnings are in the "big pile of heuristics" category of software, which in general is better learned with a data-driven ML approach... but that would be a research project.)

c42f, Sep 13 '22 02:09