lark
lark copied to clipboard
Grammar compilation has combinatorial performance for ranges and optionals
This is due to the naively aggressive inlining policy of the grammar compiler.
For example, the following grammar takes a long time to compile:
start: a? b? c? d? e? f? g? h? i? j? k? l?
a: "A"
b: "B"
c: "C"
d: "D"
e: "E"
f: "F"
g: "G"
h: "H"
i: "I"
j: "J"
k: "K"
l: "L"
Similar issue with exists with ranges, such as (a~1..10)~1..10
Obviously, they can also interact, for example (a? b? c? d?)~1..100
The solution is a smarter policy, which can extract such expressions into separate rules, when the combinatorical effect passes a certain threshold.
See also: Issue #175