lark icon indicating copy to clipboard operation
lark copied to clipboard

Inline rules expand ambiguities unnecessarily

Open yurymann opened this issue 1 year ago • 4 comments

Thanks for fixing #1214 earlier. It now seems to return all ambiguities, but ambiguities inside inline rules are expanded within the parser rather than leaving it to CollapseAmbiguities transformer.

I have some input strings that explode very quickly when expanding ambiguities. I parse them with ambiguity=explicit, then count the expected number of expanded ambiguities. If it's within a reasonable threshold, I put the tree through CollapseAmbituities. Otherwise, the processing stops (until I find a better way to deal with such strings). As the result, I still can't use inline rules because parsing of some inputs explodes before I can even count the number of ambiguities.

To reproduce, parse this string:

t = parser.parse('1 2 3')
print(t.pretty())

with this grammar

start: field+
field: f1 | f2

f1: INT
f2: INT

%ignore WS
%import common.WS
%import common.INT

Result:

start
  _ambig
    field
      f1	1
    field
      f2	1
  _ambig
    field
      f1	2
    field
      f2	2
  _ambig
    field
      f1	3
    field
      f2	3

Now, inline field:

start: _field+
_field: f1 | f2

f1: INT
f2: INT

%ignore WS
%import common.WS
%import common.INT

The parser now expands the nested ambiguities prematurely:

_ambig
  start
    f1	1
    f1	2
    f1	3
  start
    f1	1
    f1	2
    f2	3
  start
    f1	1
    f2	2
    f1	3
  start
    f1	1
    f2	2
    f2	3
  start
    f2	1
    f1	2
    f1	3
  start
    f2	1
    f1	2
    f2	3
  start
    f2	1
    f2	2
    f1	3
  start
    f2	1
    f2	2
    f2	3

yurymann avatar Nov 28 '22 23:11 yurymann