[Dart2] Using "operator" in the field name results in a syntax error

Open yososs opened this issue 3 years ago • 17 comments

According to the Dart2 specification, 'operator' can be used as a field name, but the grammar rejects it with a syntax error. The same problem probably occurs with the other marked keywords.

Keyword Specifications for Dart2

  • https://dart.dev/guides/language/language-tour#keywords

Avoid using these words as identifiers. However, if necessary, the keywords marked with superscripts can be identifiers:

  • Words with the superscript 1 are contextual keywords, which have meaning only in specific places. They’re valid identifiers everywhere.
  • Words with the superscript 2 are built-in identifiers. These keywords are valid identifiers in most places, but they can’t be used as class or type names, or as import prefixes.
  • Words with the superscript 3 are limited reserved words related to asynchrony support. You can’t use await or yield as an identifier in any function body marked with async, async*, or sync*.

All other words in the table are reserved words, which can’t be identifiers.
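
The three categories quoted above can be summarized as a small lookup. The following is only a sketch: the class and method names are made up for illustration, and the word lists are the ones used in the unit test posted later in this thread, so they may not match the current keyword table exactly.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Illustrative sketch (hypothetical names): may a word be used as a field name? */
final class DartKeywordRules {
    // Superscript 1: contextual keywords, valid identifiers everywhere.
    static final Set<String> CONTEXTUAL = new HashSet<>(Arrays.asList(
            "async", "hide", "on", "show", "sync"));

    // Superscript 2: built-in identifiers, valid identifiers in most places
    // (but not as class or type names, or as import prefixes).
    static final Set<String> BUILT_IN = new HashSet<>(Arrays.asList(
            "abstract", "as", "covariant", "deferred", "dynamic", "export",
            "extension", "external", "factory", "Function", "get", "implements",
            "import", "interface", "late", "library", "mixin", "operator",
            "part", "required", "set", "static", "typedef"));

    // Superscript 3: not usable as identifiers inside async, async*, or sync* bodies.
    static final Set<String> ASYNC_LIMITED = new HashSet<>(Arrays.asList(
            "await", "yield"));

    // Everything else in the table: reserved words, never identifiers.
    static final Set<String> RESERVED = new HashSet<>(Arrays.asList(
            "assert", "break", "case", "catch", "class", "const", "continue",
            "default", "do", "else", "enum", "extends", "false", "final",
            "finally", "for", "if", "in", "is", "new", "null", "rethrow",
            "return", "super", "switch", "this", "throw", "true", "try",
            "var", "void", "while", "with"));

    /**
     * Returns true if the word may appear as a field name, e.g. expression.word.
     * Assumes the word is already lexically a valid identifier.
     */
    static boolean canBeFieldName(String word, boolean inAsyncBody) {
        if (RESERVED.contains(word)) {
            return false;
        }
        if (ASYNC_LIMITED.contains(word)) {
            return !inAsyncBody;
        }
        // Contextual keywords, built-in identifiers, and ordinary identifiers.
        return true;
    }
}
```

Under these rules, `canBeFieldName("operator", false)` is true, which is why `expression.operator` in the reproduction below should parse.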

Code to reproduce

class A{
  bool isPlusOrMinus(Expression expression) {
    if (expression.operator == '+') return true;
    if (expression.operator == '-') return true;
    return false;
  }
}

yososs avatar May 01 '22 10:05 yososs

Then probably this:

unconditionalAssignableSelector
  : '[' expression ']'
  | '.' identifier
  ;

should become:

unconditionalAssignableSelector
  : '[' expression ']'
  | '.' identifier
  | '.' 'operator'
  ;

bkiers avatar May 01 '22 13:05 bkiers

This grammar is old. The newest grammar, maintained by Erik Ernst, is here, and appears to have fixed this issue. I recently ported the grammar here to "target-agnostic format" in response to an antlr-discussions question. I will update the grammar today.

kaby76 avatar May 02 '22 11:05 kaby76

Thanks for sharing. I will check the operation tomorrow.

yososs avatar May 02 '22 21:05 yososs

I ran the following unit test code. It works well, but I found that there are still a few problems.

  • The test for the "new" keyword fails.
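
import org.antlr.v4.runtime.BaseErrorListener;
import org.antlr.v4.runtime.CharStreams;
import org.antlr.v4.runtime.CodePointCharStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.RecognitionException;
import org.antlr.v4.runtime.Recognizer;
import org.junit.Assert;
import org.junit.Test;
// DartLexer, DartParser and DartParser.LibraryDefinitionContext are the
// ANTLR-generated classes; import them from the package they were generated into.

public class DartParserTest {

	// see: https://dart.dev/guides/language/language-tour#keywords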
	@Test
	public void testKeywords0() {
		String[] keywords0 = { "assert", "break", "case", "catch", "class", "const", "continue", "default", "do", "else",
				"enum", "extends", "false", "final", "finally", "for", "if", "in", "is", 
				"new", 
				"null", "rethrow",
				"return", "super", "switch", "this", "throw", "true", "try", "var", "void", "while", "with" };

		for (String k : keywords0) {
			String content = "class A{\n"
					+ "  bool isPlusOrMinus(Expression expression) {\n"
					+ "    if (expression."+k+" == '+') return true;\n"
					+ "    if (expression."+k+" == '-') return true;\n"
					+ "    return false;\n"
					+ "  }\n"
					+ "}\n";
//			System.out.println(content);
			
			final CodePointCharStream cstream = CharStreams.fromString(content);
			final DartLexer lexer = new DartLexer(cstream);
			final CommonTokenStream stream = new CommonTokenStream(lexer);
			stream.fill();
			DartParser parser = new DartParser(stream);
			boolean[] syntaxErr = new boolean[1];
			parser.addErrorListener(new BaseErrorListener() {
				@Override
				public void syntaxError(Recognizer<?, ?> arg0, Object arg1, int arg2, int arg3, String arg4,
						RecognitionException arg5) {
					syntaxErr[0] = true;
				}
			});
			LibraryDefinitionContext root = parser.libraryDefinition();
			Assert.assertTrue("error in "+k, syntaxErr[0]);
		}
	}
	
	@Test
	public void testKeywords1() {
		String[] keywords1 = {"show", "async", "sync", "on", "hide"};

		for (String k : keywords1) {
			String content = "class A{\n"
					+ "  bool isPlusOrMinus(Expression expression) {\n"
					+ "    if (expression."+k+" == '+') return true;\n"
					+ "    if (expression."+k+" == '-') return true;\n"
					+ "    return false;\n"
					+ "  }\n"
					+ "}\n";
//			System.out.println(content);
			
			final CodePointCharStream cstream = CharStreams.fromString(content);
			final DartLexer lexer = new DartLexer(cstream);
			final CommonTokenStream stream = new CommonTokenStream(lexer);
			stream.fill();
			DartParser parser = new DartParser(stream);
			boolean[] syntaxErr = new boolean[1];
			parser.addErrorListener(new BaseErrorListener() {
				@Override
				public void syntaxError(Recognizer<?, ?> arg0, Object arg1, int arg2, int arg3, String arg4,
						RecognitionException arg5) {
					syntaxErr[0] = true;
				}
			});
			LibraryDefinitionContext root = parser.libraryDefinition();
			Assert.assertFalse("error in "+k, syntaxErr[0]);
		}
	}
	
	@Test
	public void testKeywords2() {
		String[] keywords2 = { "abstract", "as", "covariant", "deferred", "dynamic", "export", "extension", "external",
				"factory", "Function", "get", "implements", "import", "interface", "late", "library", "mixin",
				"operator", "part", "required", "set", "static", "typedef" };

		for (String k : keywords2) {
			String content = "class A{\n"
					+ "  bool isPlusOrMinus(Expression expression) {\n"
					+ "    if (expression."+k+" == '+') return true;\n"
					+ "    if (expression."+k+" == '-') return true;\n"
					+ "    return false;\n"
					+ "  }\n"
					+ "}\n";
//			System.out.println(content);
			
			final CodePointCharStream cstream = CharStreams.fromString(content);
			final DartLexer lexer = new DartLexer(cstream);
			final CommonTokenStream stream = new CommonTokenStream(lexer);
			stream.fill();
			DartParser parser = new DartParser(stream);
			boolean[] syntaxErr = new boolean[1];
			parser.addErrorListener(new BaseErrorListener() {
				@Override
				public void syntaxError(Recognizer<?, ?> arg0, Object arg1, int arg2, int arg3, String arg4,
						RecognitionException arg5) {
					syntaxErr[0] = true;
				}
			});
			LibraryDefinitionContext root = parser.libraryDefinition();
			Assert.assertFalse("error in "+k, syntaxErr[0]);
		}
	}

	@Test
	public void testKeywords3_async() {
		String[] keywords3 = {"await", "yield"};

		for (String k : keywords3) {
			String content = "class A{\n"
					+ "  bool isPlusOrMinus(Expression expression) async {\n"
					+ "    if (expression."+k+" == '+') return true;\n"
					+ "    if (expression."+k+" == '-') return true;\n"
					+ "    return false;\n"
					+ "  }\n"
					+ "}\n";
//			System.out.println(content);
			
			final CodePointCharStream cstream = CharStreams.fromString(content);
			final DartLexer lexer = new DartLexer(cstream);
			final CommonTokenStream stream = new CommonTokenStream(lexer);
			stream.fill();
			DartParser parser = new DartParser(stream);
			boolean[] syntaxErr = new boolean[1];
			parser.addErrorListener(new BaseErrorListener() {
				@Override
				public void syntaxError(Recognizer<?, ?> arg0, Object arg1, int arg2, int arg3, String arg4,
						RecognitionException arg5) {
					syntaxErr[0] = true;
				}
			});
			LibraryDefinitionContext root = parser.libraryDefinition();
			Assert.assertTrue("error in "+k, syntaxErr[0]);
		}
	}
	
	@Test
	public void testKeywords3() {
		String[] keywords3 = {"await", "yield"};

		for (String k : keywords3) {
			String content = "class A{\n"
					+ "  bool isPlusOrMinus(Expression expression) {\n"
					+ "    if (expression."+k+" == '+') return true;\n"
					+ "    if (expression."+k+" == '-') return true;\n"
					+ "    return false;\n"
					+ "  }\n"
					+ "}\n";
//			System.out.println(content);
			
			final CodePointCharStream cstream = CharStreams.fromString(content);
			final DartLexer lexer = new DartLexer(cstream);
			final CommonTokenStream stream = new CommonTokenStream(lexer);
			stream.fill();
			DartParser parser = new DartParser(stream);
			boolean[] syntaxErr = new boolean[1];
			parser.addErrorListener(new BaseErrorListener() {
				@Override
				public void syntaxError(Recognizer<?, ?> arg0, Object arg1, int arg2, int arg3, String arg4,
						RecognitionException arg5) {
					syntaxErr[0] = true;
				}
			});
			LibraryDefinitionContext root = parser.libraryDefinition();
			Assert.assertFalse("error in "+k, syntaxErr[0]);
		}
	}
}
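
The five tests above differ only in the keyword list, the optional async modifier, and the expected outcome. As a possible simplification, the Dart snippet could be built by one helper. This is a sketch with hypothetical names (`DartSnippets`, `fieldAccessSnippet`); it only builds the source text and leaves the ANTLR parsing as it is in the tests.

```java
/** Illustrative sketch (hypothetical names): builds the Dart source used by each test. */
final class DartSnippets {
    /**
     * Returns a small Dart class whose method reads expression.<keyword>.
     * When asyncBody is true, the method body is marked async, as in testKeywords3_async.
     */
    static String fieldAccessSnippet(String keyword, boolean asyncBody) {
        String modifier = asyncBody ? " async" : "";
        return "class A{\n"
                + "  bool isPlusOrMinus(Expression expression)" + modifier + " {\n"
                + "    if (expression." + keyword + " == '+') return true;\n"
                + "    if (expression." + keyword + " == '-') return true;\n"
                + "    return false;\n"
                + "  }\n"
                + "}\n";
    }
}
```

Each test loop could then call `DartSnippets.fieldAccessSnippet(k, false)` (or `true` for the async case) instead of concatenating the string inline.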

yososs avatar May 03 '22 14:05 yososs

  • I scraped the grammar from the .tex before, and saved the results and scraper here. I'll bring the scraper up to date, sequester a copy of the current version of the .tex file, re-scrape the grammar, and redo the PR in the next day or so. I didn't replace the current dart2/ grammar a year or two ago with the scraped grammar, probably because it was "not optimized". At the time, I didn't have Trash far enough along, but it is better now.

kaby76 avatar May 04 '22 10:05 kaby76

A little bit of status...

I've been updating the scraper and have a grammar that works only "so so"--but better than the other available grammars. See https://github.com/kaby76/ScrapeDartSpec/blob/master/scraped.g4

I've written a small thread describing how this compares with the current "dart2/" grammar and the "reference grammar" that was written by the Dart Language Team. https://twitter.com/KenDomino/status/1533053623554428929

There is still a lot of work to do.

kaby76 avatar Jun 04 '22 12:06 kaby76

Do you have a comparison to the antlr4 grammar found in the dart sdk?

  • https://github.com/dart-lang/sdk/blob/master/tools/spec_parser/Dart.g

yososs avatar Jun 04 '22 17:06 yososs

Yes. I ran sdk sources through the Dart grammar written by the Dart Language Team. The results are here. It didn't do as well as the scraped grammar.

kaby76 avatar Jun 04 '22 20:06 kaby76

Comparison results are good. I will actually use it too.

  • old: 56/368
  • ref: 167/368
  • new: 187/368

yososs avatar Jun 05 '22 09:06 yososs

I ran the same test using scraped.g4.

The test for testKeywords0 now passes, but the test for testKeywords3 fails.

The Dart language seems to have a complicated syntax due to the special specification of keywords.

yososs avatar Jun 05 '22 12:06 yososs

The Spec does not define rules for dynamic types. https://github.com/dart-lang/language/issues/2276. After adding in 'dynamic' as a type, the grammar accepts 78% of the Dart sdk. Much much better.

  • old: 58/372
  • ref: 171/372
  • new: 292/372

kaby76 avatar Jun 05 '22 16:06 kaby76

https://github.com/dart-lang/language/issues/2279

Now 94% of the SDK passes.

  • old: 58/372
  • ref: 171/372
  • new: 350/372

kaby76 avatar Jun 06 '22 18:06 kaby76

Another problem with the Spec, https://github.com/dart-lang/language/issues/2282, occurs with "abstract" modifiers on fields. I have a workaround, but it's a terrible hack (the old rule was this; it is now this). The grammar in the Spec doesn't even correspond directly to the hand-written parser in the Dart compiler.

Now 95% of the SDK passes.

  • old: 58/372
  • ref: 171/372
  • new: 353/372
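
The percentages quoted in these status updates follow from the file counts. A minimal sketch of the arithmetic, assuming the figures are rounded to the nearest whole percent (the class and method names are hypothetical):

```java
/** Illustrative sketch (hypothetical names): pass rate as a whole percentage. */
final class PassRate {
    /** Share of files parsed without error, rounded to the nearest whole percent. */
    static long percent(int passed, int total) {
        if (total <= 0) {
            throw new IllegalArgumentException("total must be positive");
        }
        return Math.round(100.0 * passed / total);
    }
}
```

With the counts from the thread, `PassRate.percent(292, 372)`, `PassRate.percent(350, 372)`, and `PassRate.percent(353, 372)` give 78, 94, and 95.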

kaby76 avatar Jun 07 '22 16:06 kaby76

Status: I have a new grammar that passes 369 out of 372 Dart source files in the sdk. I think I'll stop here. I plan on using this as a bootstrap grammar to parse the Dart compiler and scrape the grammar directly from the sources. Although the quality of the grammar that the Dart team provides is very good, the fact that it's two years behind the source code means that it will always be out of date. It's a similar situation for other languages. Scraping the source of the compiler is the only real solution.

kaby76 avatar Jun 10 '22 21:06 kaby76

Status

The good news: I have a new Dart2 grammar that parses 100% of the Dart2 SDK.

The grammar requires two semantic predicates in the lexer. Since I want this to work across targets, I've been working to write the grammar in "target agnostic format".
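
To illustrate the general idea, here is a dependency-free sketch of moving predicate logic out of the grammar and into a base class, so that the grammar itself only contains a method call. Everything in it is an assumption made for illustration: the thread does not show the two predicates, so the lookahead check and all names are invented, and a real base class would extend the ANTLR runtime's `Lexer` and read from its character stream rather than from a `String`.

```java
/**
 * Illustrative stand-in for a lexer base class (hypothetical, not the actual grammar code).
 * In a grammar, a predicate would call a method like nextIsNot(...) instead of
 * embedding target-specific code directly in the .g4 file.
 */
class LexerBaseSketch {
    private final String input;
    private int position;

    LexerBaseSketch(String input) {
        this.input = input;
        this.position = 0;
    }

    /** Consumes one character, if any remain. */
    void advance() {
        if (position < input.length()) {
            position++;
        }
    }

    /** Predicate: true if the next unread character is not c (or the input is exhausted). */
    boolean nextIsNot(char c) {
        return position >= input.length() || input.charAt(position) != c;
    }
}
```

Keeping the logic in one class per target means the grammar file stays the same while only the base class is rewritten for C#, Java, and so on.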

However, the split parser for C# is not working. I have done many dozens of these conversions to "target agnostic format", for all but one of the targets, so I am confident that I am doing it correctly. While the lexer tokens are the same, the parser behaves differently between the split and combined grammars.

Therefore, it is likely that I've stumbled on a bug in the parser runtime for C#. I am looking into the problem.

kaby76 avatar Jun 12 '22 15:06 kaby76

  • I rewrote the semantic predicates of the (working) Dart2.g4 grammar in Java, and it works!
  • The (not working) split grammars for C# and Java produce identical parsing errors!

The error occurs in both C# and Java for a split grammar, but not for the combined grammar for either target. This is bad. It means there is a problem across targets for split grammars--unless the combined grammar code was supposed to produce a parse error.

kaby76 avatar Jun 12 '22 20:06 kaby76

The problem was with string literals. I defined rules that should not have been there. https://github.com/antlr/grammars-v4/pull/2654 fixes #2597.

kaby76 avatar Jun 13 '22 16:06 kaby76