
Better error message for empty ambiguous alternatives.

Open bd82 opened this issue 6 years ago • 4 comments

When the ambiguity is due to multiple empty alternatives, the error message is still built as though there is an actual "real" path of terminals.

This could be slightly confusing...

See: https://github.com/SAP/chevrotain/blob/1f82f69d5be0d27d1ddddf724257b8ea97537afd/src/parse/grammar/checks.ts#L790-L797
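One possible shape for the improvement, as a rough sketch only: special-case the empty prefix path when building the message. The function name `describeAmbiguity` and its parameters are hypothetical and are not part of Chevrotain's API; the non-empty branch simply mirrors the layout of the existing message.

```javascript
// Hypothetical helper for illustration; not Chevrotain's actual implementation.
function describeAmbiguity({ altIndices, prefixPath, orIndex, ruleName }) {
  const alts = altIndices.join(", ");
  const where = `in <OR${orIndex}> inside <${ruleName}> Rule`;

  if (prefixPath.length === 0) {
    // Empty path: no terminal is shared, every listed alternative can match
    // without consuming any input, so say that instead of printing "<>".
    return (
      `Ambiguous alternatives: <${alts}> ${where}.\n` +
      `Each of these alternatives can match without consuming any tokens.`
    );
  }

  // Non-empty path: report the shared terminals.
  return (
    `Ambiguous alternatives: <${alts}> due to common lookahead prefix\n` +
    `${where},\n` +
    `<${prefixPath.join(", ")}> may appear as a prefix path in all these alternatives.`
  );
}

const emptyCase = describeAmbiguity({
  altIndices: [1, 2],
  prefixPath: [],
  orIndex: 4,
  ruleName: "element",
});
console.log(emptyCase);
```

The wording of the empty-path message is only a suggestion; the point is that the empty case gets its own branch.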

bd82 avatar Jan 25 '18 19:01 bd82

Is an error message like this an example of the issue?:

    Ambiguous alternatives: <1 ,2> due to common lookahead prefix
    in <OR4> inside <element> Rule,
    <> may appears as a prefix path in all these alternatives.

I'm struggling to figure out what this means. It appears as though there is no common prefix path between these alternatives, which raises the question "why is that a problem?" Or is this indicative of some other issue?

mtiller avatar May 31 '22 00:05 mtiller

@mtiller Well, the common prefix is the empty path, meaning that both alternatives can be parsed without consuming any input. See this example:

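// A may match nothing: its only content is optional.
$.RULE("A", () => {
  $.OPTION(() => {
    $.CONSUME(X);
  });
});

// B may also match nothing, for the same reason.
$.RULE("B", () => {
  $.OPTION(() => {
    $.CONSUME(Y);
  });
});

// Both alternatives of C can succeed on empty input.
$.RULE("C", () => {
  $.OR([{
    ALT: () => $.SUBRULE($.A)
  }, {
    ALT: () => $.SUBRULE($.B)
  }]);
});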

Although A and B are distinct, since they parse tokens X and Y respectively, they both also parse nothing (since the body is optional), making the alternative in C ambiguous.
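To make the "both parse nothing" point concrete, here is a minimal sketch that models the three rules as plain data and checks which alternatives can match empty input. The node shapes and the `canBeEmpty` helper are assumptions made for illustration; they are not Chevrotain's internal representation or API.

```javascript
// Toy grammar model (hypothetical, for illustration only).
const terminal = (name) => ({ kind: "terminal", name });
const option = (...body) => ({ kind: "option", body });
const sequence = (...items) => ({ kind: "sequence", items });

// Returns true when the node can match without consuming any token.
function canBeEmpty(node) {
  switch (node.kind) {
    case "terminal":
      return false; // a terminal always consumes one token
    case "option":
      return true; // an optional body may be skipped entirely
    case "sequence":
      return node.items.every(canBeEmpty); // empty only if every item can be empty
    default:
      throw new Error(`unknown node kind: ${node.kind}`);
  }
}

// Rules A and B from the example: each is a single optional terminal.
const ruleA = sequence(option(terminal("X")));
const ruleB = sequence(option(terminal("Y")));
const alternativesOfC = [ruleA, ruleB];

// 1-based indices of the alternatives in C that can match empty input.
const emptyAlternatives = alternativesOfC
  .map((alt, index) => (canBeEmpty(alt) ? index + 1 : null))
  .filter((index) => index !== null);

console.log(emptyAlternatives); // both alternatives, 1 and 2, are listed
```

With more than one entry in that list, the parser has nothing to look at when the input matches neither X nor Y, which is the situation the `<>` prefix path describes.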

> Is an error message like this an example of the issue?

Yes, it definitely is.

msujew avatar May 31 '22 08:05 msujew

I understood that your case would generate this kind of message. But when I looked at mine, it didn't feature two potentially empty alternatives. Instead, my first rule has several optional patterns before finally (and certainly) consuming a token, while the second rule consumes a token immediately. Because the possible first tokens for the first rule are disjoint from the mandatory first token in the second rule, it appeared unambiguous to me, i.e., you take the next token and, if it matches the token of the second rule, use the second rule; if it doesn't match, it must be the first rule.
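The intuition above can be written down as a one-token lookahead check. This is a sketch under stated assumptions: the token names P, Q, R and T are placeholders (the real grammar is not shown in this thread), and the helpers are a toy model, not Chevrotain's lookahead algorithm.

```javascript
// Toy grammar model (hypothetical, for illustration only).
const terminal = (name) => ({ kind: "terminal", name });
const option = (...body) => ({ kind: "option", body });
const sequence = (...items) => ({ kind: "sequence", items });

// True when the node can match without consuming any token.
function canBeEmpty(node) {
  if (node.kind === "terminal") return false;
  if (node.kind === "option") return true;
  return node.items.every(canBeEmpty); // sequence
}

// Set of token names that can appear first when matching the node.
function firstTokens(node) {
  if (node.kind === "terminal") return new Set([node.name]);
  if (node.kind === "option") return firstTokens(sequence(...node.body));
  const result = new Set();
  for (const item of node.items) {
    for (const name of firstTokens(item)) result.add(name);
    if (!canBeEmpty(item)) break; // later items cannot come first
  }
  return result;
}

// First alternative: optional patterns, then a mandatory token.
const firstAlt = sequence(option(terminal("P")), option(terminal("Q")), terminal("R"));
// Second alternative: consumes a token immediately.
const secondAlt = sequence(terminal("T"));

const firstOfFirstAlt = firstTokens(firstAlt);
const overlap = [...firstTokens(secondAlt)].filter((name) => firstOfFirstAlt.has(name));

console.log([...firstOfFirstAlt]); // P, Q and R can each start the first alternative
console.log(overlap.length === 0); // true: no shared first token
console.log(canBeEmpty(firstAlt) || canBeEmpty(secondAlt)); // false: neither is empty
```

Under this simplified model the choice is unambiguous in either order, so the sketch does not reproduce the reported error; it only pins down what "appeared unambiguous" means here.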

However, that clearly isn't how the parser works. Fortunately, the solution was simple...I just reversed the two alternatives. With the first rule starting unconditionally with a specific token, the ambiguity goes away. I know that in the case of common prefixes order matters (longer potential match first). But I didn't think order would matter in this case. It might be useful to put a note in the error message about this.

mtiller avatar Jun 03 '22 02:06 mtiller

Hi @mtiller, I am not sure why you are getting this misleading error message in this case. If you can provide a small example that reproduces the problem, I can try debugging it to figure out whether something can be improved...

bd82 avatar Jun 20 '22 21:06 bd82