design-patterns-for-parser-combinators icon indicating copy to clipboard operation
design-patterns-for-parser-combinators copied to clipboard

Left-Recursive Expressions

Open j-mie6 opened this issue 3 years ago • 1 comments

Parser combinator libraries, like our very own miniparsec, are often vulnerable to left-recursion:

Expressions with left recursion cannot be encoded by recursive descent parsers and will diverge.

Let's suppose we had the parser:

bad = bad *> char 'a' <|> char 'b'

This could be interpreted as matching a regex like ba*, but using a recursive descent algorithm it will evaluate bad by immediately evaluating... bad. Oops! In order for parsers to be productive, they have to have consumed some input before they recurse again.

Commonly, the grammar can be factored to remove left-recursion, but this exposes implementation details and complicates the grammar, so ideally we could avoid that... I'll have a think.

j-mie6 avatar Aug 23 '21 14:08 j-mie6

Most parser combinator libraries support some form of "chain" combinators that abstract away the repeated application of a parser to operators. They handle the recursion properly, and apply the operators with the correct associativity.

I should add these to the library at some point...

j-mie6 avatar Aug 25 '21 19:08 j-mie6