PeppaPEG icon indicating copy to clipboard operation
PeppaPEG copied to clipboard

Precedence Climbing

Open soasme opened this issue 2 years ago • 3 comments

Is your feature request related to a problem? Please describe. See #137.

Many programming languages have a deep precedence levels, say Java has twentyish. If we use left recursion and right recursion, the performance would be so bad. Instead, many parsers such as GCC,Clang choose to implement precedence climbing.

Describe the solution you'd like

  1. New expression kind: P4_PrecedenceClimb.
  2. Syntax: S = atom < op1 < op2 < op3 < op4;. From a parsing's perspective, it's similar with S = atom ((op1 / op2 / op3 / op4) atom)*, but the construction of AST is based on the precedence level.
  3. New expression flag: P4_RightAssociativity. When not specified, it's left associativity by default.
  4. Syntax: @right_associativity pow = "^";. It allows operators using right associativity. For example, 1^2^3 should be parsed as 1^(2^3), while 1*2*3 should be parsed as (1*2)*3.

Additional context

  • https://ycpcs.github.io/cs340-fall2018/lectures/lecture06.html
  • https://www.oilshell.org/blog/2016/11/01.html
  • https://eli.thegreenplace.net/2012/08/02/parsing-expressions-by-precedence-climbing
  • https://docs.rs/peg/0.7.0/peg/#precedence-climbing

soasme avatar Sep 21 '21 08:09 soasme

https://github.com/yhirose/cpp-peglib

Just a reference, this cpp-peglib project has the precedence climbing support, but it is C++.

asmwarrior avatar Oct 31 '21 08:10 asmwarrior

Also for performance reasons having a character class that include ranges seems to cut a lot recursive calls, or maybe an optimization pass that would compress choices of characters to one character class.

char = [\x20-\x21] / [\x23-\x5b] / [\x5d-\U0010ffff] / (
    "\\" (
        "\""
        / "/"
        / "\\"
        / "b"
        / "f"
        / "n"
        / "r"
        / "t"
        / "v"
        / ("x" ~ two_hexdigits)
        / ("u" ~ four_hexdigits)
        / ("U" ~ eight_hexdigits)
    )
);

Becomes:

char = [\x20-\x21\x23-\x5b\x5d-\U0010ffff"/\\\b\f\n\r\t\v];

From pegjs:

CharacterClassMatcher
  = "[" "^"? (ClassCharacterRange / ClassCharacter)* "]" "i"? ;

ClassCharacterRange
  = ClassCharacter "-" ClassCharacter ;

mingodad avatar Dec 18 '21 10:12 mingodad

After writing the previous message I decided to make a test to see if PeppaPEG checks for duplication in choices and it seems that it doesn't, both versions of the rule bellow are parsed without any warning/error.

SingleEscapeCharacter
  = "'"
  / "\""
  / "\\"
  / "b"
  / "f"
  / "n"
  / "r"
  / "t"
  / "v" ;

With repetitions:

SingleEscapeCharacter
  = "'"
  / "\""
  / "\\"
  / "b"
  / "f"
  / "n"
  / "n" #duplicated
  / "n" #duplicated
  / "n" #duplicated
  / "r"
  / "t"
  / "v" ;

mingodad avatar Dec 18 '21 10:12 mingodad