CSharp and Java producing different error messages for expected tokens
For:
- Grammar is grammars-v4/csharp.
- ANTLR 4.10.1, Java and CSharp targets.
- Uses the standard console error listener.
- You can use trgen to generate for either CSharp or Java target on Windows or Linux.
Output:
- We're getting different lookahead sets for what to expect next for "mismatched input ':' expecting" between Java and CSharp.
- See https://github.com/antlr/grammars-v4/issues/2612#issuecomment-1120256697
Expected output:
- There should not be any diff in the error message between targets.
This needs to be solved since the build compares outputs--and soon the .tree files--across targets.
This could be solved by converting the set to a naturally ordered list.
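A minimal sketch of that idea, assuming the expected tokens are already available as display-name strings. `ExpectedTokenFormatter` and `format` are hypothetical names for illustration, not part of the ANTLR runtime. It only makes the ordering deterministic; it would not help if the two targets compute sets with different contents.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not ANTLR API: renders expected-token names in a
// deterministic order so the text does not depend on set iteration order.
class ExpectedTokenFormatter {

    static String format(Collection<String> displayNames) {
        List<String> sorted = new ArrayList<>(displayNames);
        Collections.sort(sorted); // natural (lexicographic) String ordering
        if (sorted.size() == 1) {
            // A single expected token is printed without braces.
            return sorted.get(0);
        }
        return "{" + String.join(", ", sorted) + "}";
    }

    public static void main(String[] args) {
        // Same names, different input order: both calls print the same text.
        System.out.println(format(List.of("IDENTIFIER", "'add'", "'['")));
        System.out.println(format(List.of("'['", "IDENTIFIER", "'add'")));
    }
}
```

Sorting by display name is an arbitrary choice here; sorting by token type would work equally well as long as every target does the same thing.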
I think there's more to it because the LA sets are not the same at all between CSharp and Java targets.
# Expected output remastered with the CSharp target, then tested with the Java target.
diff --git a/csharp/examples/issue-2612.txt.errors b/csharp/examples/issue-2612.txt.errors
index 3c03c4a0..6832b6ca 100644
--- a/csharp/examples/issue-2612.txt.errors
+++ b/csharp/examples/issue-2612.txt.errors
@@ -1 +1 @@
-line 4:17 mismatched input ':' expecting ';'
+line 4:17 mismatched input ':' expecting {'add', 'alias', '__arglist', 'ascending', 'async', 'await', 'by', 'descending', 'dynamic', 'equals', 'from', 'get', 'group', 'into', 'join', 'let', 'nameof', 'on', 'orderby', 'partial', 'remove', 'select', 'set', 'unmanaged', 'var', 'when', 'where', 'yield', IDENTIFIER, '[', '*', '?'}
Test failed.
That bug is strange because many tests in the test suite check for errors. Can you add it to the 'ParserErrors' test suite? Then we can check it against all targets.
This is an old bug, but I am working out why there are differences because someone is trying to use grammars-v4/csharp (CSharp target) on Stack Overflow (https://stackoverflow.com/questions/78424669/linking-visitmember-access-and-visitifstatement-methods-of-antlr4#comment138303496_78424669). (I created the desc.xml to describe which ports work for a grammar, and to try to prevent people from using bad ports. CSharp is not listed, but people ignore that anyway.)
The error recovery code in the CSharp and Java runtimes is not the same.
https://github.com/antlr/antlr4/blob/380ce4b8b1658df16ada45e1d56d5aa476052376/runtime/CSharp/src/DefaultErrorStrategy.cs#L573-L574
https://github.com/antlr/antlr4/blob/380ce4b8b1658df16ada45e1d56d5aa476052376/runtime/Java/src/org/antlr/v4/runtime/DefaultErrorStrategy.java#L480-L488
"nextTokensContext" is not null, so Java does something completely different from CSharp.
In fact, the code is not consistent across targets.
https://github.com/antlr/antlr4/blob/380ce4b8b1658df16ada45e1d56d5aa476052376/runtime/Cpp/runtime/src/DefaultErrorStrategy.cpp#L209-L210
https://github.com/antlr/antlr4/blob/380ce4b8b1658df16ada45e1d56d5aa476052376/runtime/Dart/lib/src/error/src/error_strategy.dart#L546-L558
https://github.com/antlr/antlr4/blob/380ce4b8b1658df16ada45e1d56d5aa476052376/runtime/Go/antlr/v4/error_strategy.go#L364-L365
https://github.com/antlr/antlr4/blob/380ce4b8b1658df16ada45e1d56d5aa476052376/runtime/JavaScript/src/antlr4/error/DefaultErrorStrategy.js#L424-L425
https://github.com/antlr/antlr4/blob/380ce4b8b1658df16ada45e1d56d5aa476052376/runtime/Python3/src/antlr4/error/ErrorStrategy.py#L406-L407
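To make the Java/CSharp difference concrete, here is a toy model of how I read the linked lines: when `nextTokensContext` is not null, the Java code builds the InputMismatchException from a state and context saved earlier, while the CSharp code always builds it from the recognizer's current state. Everything in the sketch (`MismatchModel`, the state numbers, the token sets) is made up for illustration and is not runtime code.

```java
import java.util.Map;
import java.util.Set;

// Toy model only: state numbers and expected sets are invented.
class MismatchModel {

    // Hypothetical "expected tokens" per parser state.
    static final Map<Integer, Set<String>> EXPECTED = Map.of(
            10, Set.of("';'"),
            20, Set.of("'['", "IDENTIFIER"));

    // Models a runtime that always reports for the current state.
    static Set<String> expectedCurrentOnly(int currentState) {
        return EXPECTED.get(currentState);
    }

    // Models a runtime that prefers a state remembered from an earlier
    // step, and falls back to the current state when nothing was saved.
    static Set<String> expectedWithRemembered(int currentState, Integer rememberedState) {
        int state = (rememberedState == null) ? currentState : rememberedState;
        return EXPECTED.get(state);
    }

    public static void main(String[] args) {
        // Same input position, different "expecting" sets in the message.
        System.out.println(expectedCurrentOnly(10));
        System.out.println(expectedWithRemembered(10, 20));
    }
}
```

With nothing remembered, both paths agree. As soon as a state is remembered, the two report different sets for the same mismatched token, which is the kind of diff shown in the test failure.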
It seems the change in Java was made for https://github.com/antlr/antlr4/issues/1922 in commit https://github.com/antlr/antlr4/commit/0803c74eb255fe5e6fe5a6f11d879198d305e715, but it wasn't applied to the other targets, even though they already existed at that commit: https://github.com/antlr/antlr4/tree/0803c74eb255fe5e6fe5a6f11d879198d305e715/runtime . Sloppy coding.