Mismatched and expecting the same token

Open josago97 opened this issue 2 years ago • 6 comments

Hello, I have been testing in the ANTLR Lab and I cannot understand this error. The screenshot attached below shows a simple grammar and an error saying that the token it found is different from the one it was expecting. But they are the same token, so the message doesn't make sense. Why does this happen? Thank you

image

The example grammar:

grammar Example;

class: ABSTRACT 'class' IDENTIFIER;

IDENTIFIER: [A-Za-z_][A-Za-z0-9_]*;
ABSTRACT: 'abstract';

// Ignore
WS: [ \r\n\t]+ -> skip;

The example input: abstract class Plane

josago97 avatar Oct 15 '23 17:10 josago97

It looks like the keyword abstract is being recognized as IDENTIFIER because IDENTIFIER has higher priority (it's written above ABSTRACT). To resolve the issue, you can try swapping these tokens:

ABSTRACT: 'abstract';
IDENTIFIER: [A-Za-z_][A-Za-z0-9_]*;

KvanTTT avatar Oct 15 '23 18:10 KvanTTT

Okay, thank you. So when a lexer rule is a constant word, does it have to be placed above lexer rules that are regular expressions?

josago97 avatar Oct 15 '23 19:10 josago97

It's not about constant words, it's about order. IDENTIFIER can also match the sequence abstract, and it is placed before the ABSTRACT token. That's why it has higher priority, and the ABSTRACT token is actually unreachable. The ANTLR lexer first tries to match the longest sequence, and if two rules match sequences of the same length, it picks the rule written first in the grammar.
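The two rules described above (longest match first, then rule order on a tie) can be sketched with a toy tokenizer in Python. This is not how ANTLR is implemented; it only mimics the selection rule. The function `tokenize`, the rule lists, and the `CLASS` rule (a stand-in for the 'class' literal, listed first as an assumption of this sketch) are all hypothetical names. Python's `re` is greedy rather than strictly longest-match, which makes no difference for these simple patterns.

```python
import re

def tokenize(text, rules, skip=("WS",)):
    """Toy lexer: the longest match wins; on a tie, the rule listed first wins."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # Strictly greater: a later rule with an equal-length match
            # never replaces an earlier one.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name not in skip:
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens

# Hypothetical rule tables mirroring the grammar in this thread.
CLASS = ("CLASS", r"class")  # stand-in for the 'class' literal (assumed first)
IDENTIFIER = ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*")
ABSTRACT = ("ABSTRACT", r"abstract")
WS = ("WS", r"[ \r\n\t]+")

original_order = [CLASS, IDENTIFIER, ABSTRACT, WS]  # IDENTIFIER above ABSTRACT
swapped_order = [CLASS, ABSTRACT, IDENTIFIER, WS]   # ABSTRACT above IDENTIFIER

print(tokenize("abstract class Plane", original_order))
print(tokenize("abstract class Plane", swapped_order))
print(tokenize("abstracts", swapped_order))
```

With `original_order`, the word abstract comes out as IDENTIFIER, which is why the parser reports a mismatch. With `swapped_order` it comes out as ABSTRACT, while a longer word such as abstracts is still an IDENTIFIER because the longest match is checked before rule order.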

But your question is reasonable, and a warning (unreachable token) should be reported here to avoid confusion. We have an issue on this: https://github.com/antlr/antlr4/issues/1072

KvanTTT avatar Oct 17 '23 13:10 KvanTTT

Am I stupid or something? Even when putting ABSTRACT before IDENTIFIER, I still get the same error.

Image

grammar test;

class: ABSTRACT 'class' IDENTIFIER;

ABSTRACT: 'abstract';
IDENTIFIER: [A-Za-z_][A-Za-z0-9_]*;


// Ignore
WS: [ \r\n\t]+ -> skip;

I've been trying to use antlr4 for several days and always run into this issue. Is there something I'm missing?

mcquenji avatar May 24 '25 13:05 mcquenji

Even when putting ABSTRACT before IDENTIFIER i still get the same error.

... grammar test;

You are using lab.antlr.org with a combined grammar that you pasted over the sample split grammar. Split grammars have separate grammars for the lexer and the parser. Combined grammars have only one grammar (i.e. not "parser grammar testParser; ..." and "lexer grammar testLexer; ..."). In the upper left of the UI, go through each grammar by selecting the tab above the grammar window. Erase the lexer grammar and make sure you have only a parser grammar. Then try again.

This happens a lot. The UI should have a checkbox to switch between the two and nuke the lexer tab if combined. Or it should be adaptive in presenting one or two tabs depending on the grammar declaration.

kaby76 avatar May 24 '25 14:05 kaby76

jesus christ thanks!!! this was driving me nuts... cant believe i missed that 😅

mcquenji avatar May 24 '25 14:05 mcquenji