pest icon indicating copy to clipboard operation
pest copied to clipboard

Solve left-recursion and improve error recovery using pika parsing

Open Nadrieril opened this issue 5 years ago • 2 comments

This recent paper (https://arxiv.org/abs/2005.06444) describes an alternative technique for parsing PEG grammars: instead of top-down, left-to-right, they propose a bottom-up, right-to-left approach, that correctly handles left-recursion and apparently has also better errors. It's essentially packrat parsing but with a different approach to dynamic programming, so the semantics are identical. I know the pest team doesn't currently have a lot of resources but I thought this was a cool idea if someone else is interested. I think it would be possible to make a generic external parsing library that pest can then use instead of its own packrat parser.

Nadrieril avatar May 17 '20 19:05 Nadrieril

How about this one (https://www.sciencedirect.com/science/article/pii/S0167642314000288)? I think this one more suitable for top-down by specially handling a recursive pattern.

It uses a memoization just like the packrat parser to store the matching of minimum recursive bound (zero recurse). Then the algorithm incrementally increases the bound to make it greedy enough while still utilizing the stored value to make it faster. When it fails it stops.

The Pest PEG can be augmented with a recursive marker to tell an intentional recursive pattern. Then everything can start from there, while unmarked one still produces error.

Abdillah avatar Jul 31 '21 00:07 Abdillah

I'm trying to implement the paper (or as close as I understand), here https://github.com/pest-parser/pest/pull/533.

Abdillah avatar Aug 21 '21 13:08 Abdillah