design-patterns-for-parser-combinators
design-patterns-for-parser-combinators copied to clipboard
Left-Recursive Expressions
Parser combinator libraries, like our very own miniparsec
, are often vulnerable to left-recursion:
Expressions with left recursion cannot be encoded by recursive descent parsers and will diverge.
Let's suppose we had the parser:
bad = bad *> char 'a' <|> char 'b'
This could be interpreted as matching a regex like ba*
, but using a recursive descent algorithm it will evaluate bad
by immediately evaluating... bad
. Oops! In order for parsers to be productive, they have to have consumed some input before they recurse again.
Commonly, the grammar can be factored to remove left-recursion, but this exposes implementation details and complicates the grammar, so ideally we could avoid that... I'll have a think.
Most parser combinator libraries support some form of "chain" combinators that abstract away the repeated application of a parser to operators. They handle the recursion properly, and apply the operators with the correct associativity.
I should add these to the library at some point...