grmtools icon indicating copy to clipboard operation
grmtools copied to clipboard

Apparently infinite recursive rule

Open ratmice opened this issue 2 years ago • 5 comments

One of the "fun" things about my project is running the parser on strange, half edited/incomplete changes. Here is one such case I encountered that way, and have minimized.

given the input character a, this will cause an infinite loop pushing to pstack between https://github.com/softdevteam/grmtools/blob/master/lrpar/src/lib/parser.rs#L297 https://github.com/softdevteam/grmtools/blob/master/lrtable/src/lib/statetable.rs#L461-L466 adding a case like: Some(i) if i == usize::from(stidx) + 1 => None, to goto fixes it, (i.e. the return value of goto == prior).

Filing this as a bug report rather than sending a PR though, because I haven't yet tested it against valid parsers, or as of yet tried to surmise if this case can only and always lead to infinite recursion or if it ever actually comes up in a valid way.

%%
a "a"
[\t\n ] ;
%%
Start: Bar;
Foo: "a" | ;
Bar: Foo | Foo Bar;

ratmice avatar May 12 '22 16:05 ratmice

From memory, yacc (or Bison maybe?) statically detects such cases and refuses to compile them (or, at least, issues a warning). If that memory is correct, we should consider doing something similar IMHO.

ltratt avatar May 12 '22 17:05 ltratt

I wrote a gist that contains, some graphs and a breakdown of the issue, some potential ways we could go about fixing it. the potential impacts of trying to do exactly what yacc/bison do here. I think the last potential fix mentioned (providing a mechanism to produce a pruned StateGraph), seems like a nice way to maintain the existing relation shared between StateTable and StateGraph's StIdx.

https://gist.github.com/ratmice/180cc7daa6e28a285885affd2e623f71

I mostly wanted to think through the semver ramifications of fixing this, and I think there are 2 options for fixing it with low impact to the code or semver, however changing the goto table the way yacc appears to seems to be quite impactful.

If you think that one of these low impact options is viable, I'm kind of inclined not to be in any rush to fix this, since running a parser with conflicts, without %expect, %expect-rr is probably unwise, and I'm fairly convinced this can only happen in the presence of conflicts.

ratmice avatar Aug 18 '22 08:08 ratmice

Broadly speaking I'm in favour of making these errors (with the minor caveat that I'd like https://softdevteam.github.io/grmtools/master/book/errorrecovery.html#turning-lexing-errors-into-parsing-errors (where we use an otherwise-unused rule) to remain possible, but there might be a better way of doing it!). I find it hard to imagine anyone writes a grammar intending to leave such problems in.

ltratt avatar Aug 19 '22 18:08 ltratt

I would think that perhaps a new declaration like %allow-unused Unmatched could be used to disable such an error (Or perhaps something which indicates its usage by error recovery?).

I'm mildly cognizant though of where it seems likely when you are initially writing a rule, you are less likely to have referenced within the rest of your grammar, which is kind of why I was thinking of the same way that conflicts() and missing_from_lexer/missing_from_parser returned from set_rule_ids, in that those don't actually produce a StateTableError, but are treated as warnings/errors by the individual tools.

One last thing, is I'm getting the impression that unused (due to conflicts) it seems like is just a way of deriving which state of the conflict was dropped (And I think in yacc just Shift/Reduce ones), I think it is basically able to be derived already from the information within Conflicts, since I believe it returns the PIdx of the production which was discarded during conflict resolution. So it seems like we might have everything to produce that within the tools already.

ratmice avatar Aug 19 '22 19:08 ratmice

I would think that perhaps a new declaration like %allow-unused Unmatched could be used to disable such an error

This feels quite yacc-y to me! I can certainly live with that.

[I must admit that the other parts of statetable generation / analysis have receded so far into the back of my brain that I can't remember how this edge-case stuff might work!]

ltratt avatar Aug 19 '22 20:08 ltratt