peginator icon indicating copy to clipboard operation
peginator copied to clipboard

Confusing error message in case of a syntax error

Open ethindp opened this issue 2 years ago • 4 comments

I'm very confused. The peginator-cli binary is giving me this error:

Error: expected end of input
--> ada.peg:6801:1
 |  
 |  
 |  ^

But my grammar certainly isn't done on this line (the number of lines are primarily because I generated the unicode character classes that I needed to match.... Probably not the "best" way I could do it, but it (should) work.) Anyway, it's getting stuck on this:

@string
@check(crate::checks::check_identifier)
Identifier =
IdentifierStart {IdentifierStart | IdentifierExtend} !(Abort | Abs | Abstract | Accept | Access | Aliased | All | And | Array | At | Begin | Body | Case | Constant | Declare | Delay | Delta | Digits | Do | Else | Elsif | End | Entry | Exception | Exit | For | Function | Generic | Goto | If | In | Interface | Is | Limited | Loop | Mod | New | Not | Null | Of | Or | Others | Out | Overriding | Package | Parallel | Pragma | Private | Procedure | Protected | Raise | Range | Record | Rem | Renames | Requeue | Return | Reverse | Select | Separate | Some | Subtype | Synchronized | Tagged | Task | Terminate | Then | Type | Until | Use | When | While | With | Xor);

@char
IdentifierStart =UppercaseLetter
| LowercaseLetter
| TitlecaseLetter
| ModifierLetter
| OtherLetter
| LetterNumber;

@char
IdentifierExtend =NonspacingMark
| SpacingMark
| DecimalNumber
| ConnectorPunctuation;

NumericLiteral = @:DecimalLiteral | @:BasedLiteral;
# Dies where this comment is
@skip_no_ws
@check(crate::checks::check_decimal_literal)
DecimalLiteral = integer_part:Numeral ["." fractional_part:Numeral] [exponent:Exponent];

@string
@skip_no_ws
Numeral = Digit {["_"] Digit};

@string
@skip_no_ws
Exponent = i'E' [Add] Numeral | i'E' Subtract Numeral;

@char
Digit = '0' .. '9';

@check(crate::checks::check_based_literal)
BasedLiteral =
base:Base "#" integer_part:BasedNumeral ["." fractional_part:BasedNumeral] "#" [exponent:Exponent];

@string
@skip_no_ws
Base = Numeral;

@string
@skip_no_ws
BasedNumeral =
ExtendedDigit {["_"] ExtendedDigit};

@char
ExtendedDigit = Digit | i'A' .. i'F';

At a glance I can't see any kind of syntax error, so I'm quite confused. Does order of the rules matter? (I hope not...)

ethindp avatar Aug 23 '23 20:08 ethindp

The order of the rules don't matter. The error is very misleading though, the problem is that the directive is called @no_skip_ws, and not @skip_no_ws

badicsalex avatar Aug 23 '23 21:08 badicsalex

@badicsalex Post-processing on directives and rejecting unrecognized directives and saying "Unknown directive x" would be better.

ethindp avatar Aug 23 '23 22:08 ethindp

Also, the error message when misusing @ overrides doesn't give you any line information. E.g.: I get Error: Enum '@:' fields have to be used exactly once in all choice branches, and must not be used in closures or optional parts, but this tells me nothing about where the error is, just that it exists. Would it be possible to include error information with these?

ethindp avatar Aug 23 '23 22:08 ethindp

Notes to myself (or anyone who's willing to fix this):

This error is caused by the structure of the peginator grammar, namely the first rule:

Grammar = {(rules:Rule | rules:CharRule | rules:ExternRule) ";"} $ ;

Parsing the rule list stops at the first weird directive, but is successful otherwise. Then comes the EOI check, and that throws an error. Unfortunately the EOI check happens at the same place as the first parsing failure (the position of the "@", since the directive check strings contain the @ already).

If there are multiple furthest errors, all of them should be stored. But this should be opt-in, as this will mean a lot of allocations, and keep in mind that ParseState is cloned a lot, so this would severely affect parsing speed. Maybe a Cow<Vec<>> (or an optimized small vec crate with cow semantics) would be appropriate.

badicsalex avatar Aug 24 '23 07:08 badicsalex