chevrotain
Recovery at end of input not working?
I'm trying to get recovery to insert a token at the end of the input, but it doesn't appear to be working. Here's a really simple grammar to parse function calls like foo(). I want it to also parse foo( using token insertion. Am I doing something wrong?
(function expressionExample() {
    // ----------------- Lexer -----------------
    const createToken = chevrotain.createToken;
    const Lexer = chevrotain.Lexer;

    const Identifier = createToken({name: "Identifier", pattern: /[a-zA-Z]+/});
    // Despite their names, LCurly and RCurly match parentheses, not curly braces.
    const LCurly = createToken({name: "LCurly", pattern: /\(/});
    const RCurly = createToken({name: "RCurly", pattern: /\)/});

    const expressionTokens = [Identifier, LCurly, RCurly];
    const ExpressionLexer = new Lexer(expressionTokens, {
        positionTracking: "onlyStart"
    });

    // Labels only affect error messages and Diagrams.
    LCurly.LABEL = "'('";
    RCurly.LABEL = "')'";

    // ----------------- parser -----------------
    const Parser = chevrotain.Parser;

    class ExpressionParser extends Parser {
        constructor() {
            super(expressionTokens, {
                recoveryEnabled: true
            });
            const $ = this;

            $.RULE("expression", () => {
                $.CONSUME(Identifier);
                $.CONSUME(LCurly);
                $.CONSUME(RCurly);
            });

            // very important to call this after all the rules have been setup.
            // otherwise the parser may not work correctly as it will lack information
            // derived from the self analysis.
            this.performSelfAnalysis();
        }
    }

    // for the playground to work the returned object must contain these fields
    return {
        lexer: ExpressionLexer,
        parser: ExpressionParser,
        defaultRule: "expression"
    };
}())
Hi @tlrobinson
The simple example certainly helps. 👍
I don't think the recovery logic handles the edge case of end of input (EOI).
canRecoverWithSingleTokenInsertion(
    this: MixedInParser,
    expectedTokType: TokenType,
    follows: TokenType[]
): boolean {
    if (!this.canTokenTypeBeInsertedInRecovery(expectedTokType)) {
        return false
    }
    // must know the possible following tokens to perform single token insertion
    if (isEmpty(follows)) {
        return false
    }
    let mismatchedTok = this.LA(1)
    let isMisMatchedTokInFollows =
        find(follows, (possibleFollowsTokType: TokenType) => {
            return this.tokenMatcher(mismatchedTok, possibleFollowsTokType)
        }) !== undefined
    return isMisMatchedTokInFollows
}
So to perform single token insertion, the token actually encountered must match one of the possible next tokens (the tokens that may follow the inserted one). This condition is met in your scenario:
- foo ( EOF
- foo ( ) EOF
However, I do not believe EOF is counted as one of the possible next tokens, as it is an implicit EOF.
I've tried to explicitly add a CONSUME(chevrotain.EOF) at the end of the rule but without luck. I guess I need to debug this in more depth, I'll update when I find out more.
All right, I've debugged this again, but this time using a full dev env instead of the playground.
Adding an EOF token explicitly seems to resolve the problem.
$.RULE("expression", () => {
    $.CONSUME(Identifier)
    $.CONSUME(LCurly)
    $.CONSUME(RCurly)
    $.CONSUME(chevrotain.EOF)
})
- Note that the EOF should be consumed in the top-level rule (entry point) of your grammar.
It is possible to write a patch that infers the existence of EOF as a "possible next token" in such a case. However, because EOF is implicit, this is a tiny bit complicated and may not be warranted or a high priority when a simple workaround is available...
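To illustrate why the explicit EOF makes a difference, here is a minimal standalone model of the follow-set check quoted earlier. It is only a sketch and does not use chevrotain: token types are plain strings, the canTokenTypeBeInsertedInRecovery step is left out, and the two follow sets are assumptions based on the explanation above (empty when EOF is implicit, containing EOF when the rule ends with CONSUME(EOF)).

```javascript
// Simplified stand-in for the single-token-insertion check.
// Token types are plain strings here; the real library uses TokenType objects.
const EOF = "EOF";

function canRecoverWithSingleTokenInsertion(mismatchedTok, follows) {
    // Without any known follow tokens, insertion is never attempted.
    if (follows.length === 0) {
        return false;
    }
    // Insertion is allowed only if the token actually seen could legally
    // follow the token that would be inserted.
    return follows.includes(mismatchedTok);
}

// Input "foo(": the parser expects ")" but sees the end of input instead.
const mismatchedTok = EOF;

// Assumption: with an implicit EOF, nothing is known to follow ")".
const followsImplicitEof = [];
// Assumption: with CONSUME(EOF) at the end of the rule, EOF follows ")".
const followsExplicitEof = [EOF];

const implicitResult = canRecoverWithSingleTokenInsertion(mismatchedTok, followsImplicitEof);
const explicitResult = canRecoverWithSingleTokenInsertion(mismatchedTok, followsExplicitEof);

console.log(implicitResult); // false
console.log(explicitResult); // true
```

In this model, the missing ")" can only be inserted once EOF is part of the known follow set, which matches the behaviour observed with the workaround.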