lark
Found a gotcha with the interactive parser
from lark import Lark

parser = Lark(r"""
%ignore /[\t \f]+/ // WS
start: d|c
d: B
c: A B
A: "abc"
B: /[^\W\d]\w*/
""", parser="lalr")
input_str = 'abc'
interactive = parser.parse_interactive(input_str)
print(interactive.exhaust_lexer())  # tokens lexed and fed to the parser so far
print(interactive.accepts())  # terminals the parser would accept next
The output is:
[Token('A', 'abc')]
{'B'}
But if you append more word characters to the input so that it reads as a single B token (input_str = 'abcfdsfds'), the output turns into:
[Token('B', 'abcfdsfds')]
{'$END'}
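So after 'abc', accepts() reports {'B'}, but typing a B directly after it does not produce A followed by B; the lexer merges everything into one B token. Is there a way to recognize that we need a whitespace-separated B token and not just an immediate B?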