Mismatched and expecting the same token
Hello, I have been testing in the ANTLR Lab and I cannot understand this error. The screenshot attached below shows a simple grammar and an error saying that the token it found is different from the one it was expecting, but they are the same token, which doesn't make sense. Why does this happen? Thank you.
The example grammar:
grammar Example;
class: ABSTRACT 'class' IDENTIFIER;
IDENTIFIER: [A-Za-z_][A-Za-z0-9_]*;
ABSTRACT: 'abstract';
// Ignore
WS: [ \r\n\t]+ -> skip;
The example input:
abstract class Plane
It looks like the keyword abstract is being recognized as IDENTIFIER because IDENTIFIER has higher priority (it is written above ABSTRACT). To resolve the issue, you can try swapping these two tokens:
ABSTRACT: 'abstract';
IDENTIFIER: [A-Za-z_][A-Za-z0-9_]*;
Okay, thank you. So when a lexer rule is a constant word, does it have to be placed above lexer rules that are regular expressions?
It's not about constant words, it's about order. IDENTIFIER can also match the sequence abstract, and it is placed before the ABSTRACT token. That is why it has higher priority, and the ABSTRACT token is actually unreachable. The ANTLR lexer first tries to match the longest sequence, and if two rules match sequences of the same length, it picks the rule written first in the grammar.
But your question is reasonable, and a warning (unreachable token) should be reported here to avoid confusion. We have an issue on this: https://github.com/antlr/antlr4/issues/1072
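The longest-match and rule-order behaviour described above can be sketched in a few lines of Python. This is only a rough simulation using the standard re module, not ANTLR's actual implementation; the tokenize helper and the ORIGINAL and SWAPPED rule lists are hypothetical names made up for illustration. It also assumes that the implicit 'class' literal from the parser rule is ordered before the explicit lexer rules, which is how combined grammars are understood to behave.

```python
import re

def tokenize(rules, text):
    """Rough simulation: the longest match wins; on a tie, the earliest rule wins."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # Strict '>' keeps the earlier rule when two matches have equal length.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "WS":  # mimic '-> skip'
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

# Assumption: the implicit 'class' token is listed before the explicit rules.
ORIGINAL = [
    ("'class'", r"class"),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("ABSTRACT", r"abstract"),
    ("WS", r"[ \r\n\t]+"),
]
# Same rules with ABSTRACT moved above IDENTIFIER.
SWAPPED = [ORIGINAL[0], ORIGINAL[2], ORIGINAL[1], ORIGINAL[3]]

original_tokens = tokenize(ORIGINAL, "abstract class Plane")
swapped_tokens = tokenize(SWAPPED, "abstract class Plane")
print(original_tokens)  # first token is ('IDENTIFIER', 'abstract')
print(swapped_tokens)   # first token is ('ABSTRACT', 'abstract')
```

With the original order, abstract comes out as IDENTIFIER, so the parser rule never sees an ABSTRACT token. With the swapped order, both rules still match eight characters, but ABSTRACT is listed first and wins the tie. A longer word such as abstraction is still an IDENTIFIER in either order, because the longest match is preferred before rule order is considered.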
Am I stupid or something? Even when putting ABSTRACT before IDENTIFIER I still get the same error.
grammar test;
class: ABSTRACT 'class' IDENTIFIER;
ABSTRACT: 'abstract';
IDENTIFIER: [A-Za-z_][A-Za-z0-9_]*;
// Ignore
WS: [ \r\n\t]+ -> skip;
I've been trying to use antlr4 for several days and always run into this issue. Is there something I'm missing?
"Even when putting ABSTRACT before IDENTIFIER I still get the same error.... grammar test;"
You are using lab.antlr.org with a combined grammar that you pasted over the sample split grammar. Split grammars have separate grammars for the lexer and the parser. Combined grammars have only one grammar (i.e. not "parser grammar testParser; ..." and "lexer grammar testLexer; ..."). In the upper left of the UI, go through each grammar by selecting the tab above the grammar window. Erase the lexer grammar and make sure you have only a parser grammar. Then try again.
This happens a lot. The UI should have a checkbox to switch between the two and nuke the lexer tab if combined. Or it should be adaptive in presenting one or two tabs depending on the grammar declaration.
jesus christ thanks!!! this was driving me nuts... cant believe i missed that 😅