dart-petitparser

ReferenceParser prevents repeatString optimization

Open • amake opened this issue 2 years ago • 1 comment

Consider this grammar:

```dart
class MyGrammar extends GrammarDefinition {
  @override
  Parser start() => ref0(myChars).starString().end();

  Parser<String> myChars() => anyOf('abc');
}
```

The linter will produce a warning to use one of the *String parsers, despite this already being the case:

```dart
expect(linter(MyGrammar().build()), isEmpty);
```

Output:

  Expected: empty
    Actual: [
              LinterIssue:LinterIssue(type: LinterType.warning, title: Character repeater, parser: Instance of 'FlattenParser<List<String>>', description: A flattened repeater (Instance of 'PossessiveRepeatingParser<String>'[0..*]) that delegates to a character parser (Instance of 'SingleCharacterParser'[any of " \t" expected]) can be much more efficiently implemented using `starString`, `plusString`, `timesString`, or `repeatString` that directly returns the underlying String instead of an intermediate List.)
            ]

I think the cause is that the wrapping ReferenceParser obscures the class of myChars, so the type checks in repeatString fail and we end up on this line: https://github.com/petitparser/dart-petitparser/blob/d9b113329d454e76912cf6c78214eabfca7ec828/lib/src/parser/repeater/character.dart#L74

This seems like a difficult problem: the class of the referent is not known until build time, but repeatString needs to be called earlier.

Is the only solution here "don't use a reference"?

(Bigger question: Should references only be used to break cycles? I was under the impression that it was good practice to use them by default.)

amake, Aug 28 '23 00:08

Good observation and analysis. This is indeed a tricky problem that might cause subtle differences in a few other places too, such as when flattening the | / .or(Parser) and & / .seq(Parser) operators.

Another workaround could be to call optimize on the built parser:

```dart
expect(linter(optimize(MyGrammar().build())), isEmpty);
```

Regarding references: This is really up to you. The idea was that if you always use ref you can forget about cycles, but I have written grammars where ref0 was only used between productions and tokens were stored in (constant) variables.
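A minimal sketch of that second style, applied to the grammar from the issue. It assumes petitparser 6.x; the `word` production and the top-level placement of `myChars` are illustrative choices, not something prescribed by the library. Because `starString` is called on the character parser itself rather than on a reference, its type checks should see the concrete class, so the linter warning should no longer apply.

```dart
import 'package:petitparser/petitparser.dart';

// Character-level token stored in a plain variable instead of behind ref0,
// so starString is called directly on the concrete character parser.
// (The name mirrors the issue; keeping it top-level is an assumption.)
final Parser<String> myChars = anyOf('abc');

class MyGrammar extends GrammarDefinition {
  @override
  Parser start() => ref0(word).end();

  // ref0 is still used between productions; `word` is a hypothetical
  // production added for illustration only.
  Parser<String> word() => myChars.starString();
}

void main() {
  final parser = MyGrammar().build();
  // Parse a sample input and print the resulting value.
  print(parser.parse('abcabc').value);
}
```

The trade-off is that a token stored in a variable cannot take part in a cycle, so this only suits leaf-level parsers such as character classes.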

renggli, Aug 28 '23 05:08