lark
lark copied to clipboard
Inline rules expand ambiguities unnecessarily
Thanks for fixing #1214 earlier. It now seems to return all ambiguities, but ambiguities inside inline rules are expanded within the parser rather than leaving it to CollapseAmbiguities
transformer.
I have some input strings that explode very quickly when expanding ambiguities. I parse them with ambiguity=explicit
, then count the expected number of expanded ambiguities. If it's within a reasonable threshold, I put the tree through CollapseAmbituities
. Otherwise, the processing stops (until I find a better way to deal with such strings). As the result, I still can't use inline rules because parsing of some inputs explodes before I can even count the number of ambiguities.
To reproduce, parse this string:
t = parser.parse('1 2 3')
print(t.pretty())
with this grammar
start: field+
field: f1 | f2
f1: INT
f2: INT
%ignore WS
%import common.WS
%import common.INT
Result:
start
_ambig
field
f1 1
field
f2 1
_ambig
field
f1 2
field
f2 2
_ambig
field
f1 3
field
f2 3
Now, inline field
:
start: _field+
_field: f1 | f2
f1: INT
f2: INT
%ignore WS
%import common.WS
%import common.INT
The parser now expands the nested ambiguities prematurely:
_ambig
start
f1 1
f1 2
f1 3
start
f1 1
f1 2
f2 3
start
f1 1
f2 2
f1 3
start
f1 1
f2 2
f2 3
start
f2 1
f1 2
f1 3
start
f2 1
f1 2
f2 3
start
f2 1
f2 2
f1 3
start
f2 1
f2 2
f2 3