
Pipe in terminal regex not working as expected

Open sidhiadkoli opened this issue 1 year ago • 1 comments

What is your question?

I'm facing an issue with a pipe (`|`) inside a terminal regex.

Here is a subset of the grammar in question:

from lark import Lark

grammar = r"""
start: START

START: QUARTER [WS+ YEAR]
QUARTER: /q[1-4]/
WS: /\s/
YEAR: /(19[0-9]{2})|(20[0-3][0-9])/
"""

print(Lark(grammar).parse("q1 1923"))    # works
print(Lark(grammar).parse("q1 2023"))    # doesn't work

However, when we add parentheses around the entire YEAR regex, both example strings parse correctly.

This works:

from lark import Lark

grammar = r"""
start: START

START: QUARTER [WS+ YEAR]
QUARTER: /q[1-4]/
WS: /\s/
YEAR: /((19[0-9]{2})|(20[0-3][0-9]))/
"""

print(Lark(grammar).parse("q1 1923"))    # works
print(Lark(grammar).parse("q1 2023"))    # works now

What am I missing here?

sidhiadkoli avatar May 14 '24 21:05 sidhiadkoli

This is a bug in the way lark combines terminals; #1415 fixes it.
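For readers wondering why an ungrouped pipe could matter when terminals are combined, here is a minimal sketch using only Python's `re` module. It does not reproduce lark's internal code; the `naive` and `grouped` patterns are hypothetical reconstructions, chosen only because they are consistent with the behaviour reported above. The point is that `|` has the lowest precedence in a regex, so pasting a pattern that contains a top-level `|` into a larger pattern without wrapping it changes what the alternation covers.

```python
import re

# The three terminal patterns from the grammar in the question.
QUARTER = r"q[1-4]"
WS = r"\s"
YEAR = r"(19[0-9]{2})|(20[0-3][0-9])"

# Hypothetical naive combination: YEAR is pasted in as-is, so its top-level
# `|` splits the optional group into "WS+ 19xx" OR "20xx" (with no whitespace).
naive = f"{QUARTER}(?:(?:{WS})+{YEAR})?"

# Same combination, but every sub-pattern is wrapped in a non-capturing
# group first, so the `|` stays local to YEAR.
grouped = f"(?:{QUARTER})(?:(?:{WS})+(?:{YEAR}))?"


def matches(pattern: str, text: str) -> bool:
    """Return True if the whole of `text` matches `pattern`."""
    return re.fullmatch(pattern, text) is not None


for text in ("q1 1923", "q1 2023"):
    print(text, "naive:", matches(naive, text), "grouped:", matches(grouped, text))
```

Under these assumptions, the naive pattern accepts "q1 1923" but rejects "q1 2023", while the grouped pattern accepts both. That mirrors the workaround in the question: adding outer parentheses to YEAR by hand has the same effect as grouping each terminal before concatenation.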

MegaIng avatar May 16 '24 13:05 MegaIng