lark icon indicating copy to clipboard operation
lark copied to clipboard

Some grammars can throw the CYK parser into an infinite loop

Open erezsh opened this issue 6 years ago • 7 comments

Not likely to happen naturally, but possible.

For example:

parser = lark.Lark("start: a\na: b\n b:c\n c:a\n | /a/")
parser.parse('a')

Throws all parsers into an infinite loop (the cyk parser already enters an infinite loop on the first line)

Possible solution: Detect infinite loops in grammar, and throw an exception.

erezsh avatar Mar 23 '18 23:03 erezsh

Hey I would like to work on this issue.

SarthakDandriyal avatar Mar 02 '19 01:03 SarthakDandriyal

Great, be my guest.

Earley doesn't enter a loop anymore in recent versions.

The only thing left to fix is the cyk parser.

erezsh avatar Mar 02 '19 09:03 erezsh

I would like to work on that can you guide me more about the cyk parser .

SarthakDandriyal avatar Mar 02 '19 09:03 SarthakDandriyal

What kind of guidance do you need?

The code of the cyk parser is here: https://github.com/lark-parser/lark/blob/master/lark/parsers/cyk.py

You should figure out which line is causing the infinite loop, and take it from there.

erezsh avatar Mar 02 '19 09:03 erezsh

Thank you I will work on it.

SarthakDandriyal avatar Mar 02 '19 09:03 SarthakDandriyal

hey @erezsh can you give me some test cases for the cyk parser.

SarthakDandriyal avatar Mar 03 '19 06:03 SarthakDandriyal

What do you mean?

Did you see this page? https://lark-parser.readthedocs.io/en/latest/how_to_develop/

erezsh avatar Mar 03 '19 09:03 erezsh