lark
lex - When an error happens, how can I display all tokens matched so far?
When an error happens (lark.exceptions.UnexpectedCharacters), there is usually some "Previous tokens" information such as this:
Previous tokens: Token('__ANON_0', 'CL79')
That only seems to contain the token immediately preceding the error, but not the ones before. Am I doing something wrong, or is there a way to display all tokens matched so far?
It's possible to collect all the tokens by writing a postlexer.
Another way is to parse using the interactive parser.
Mind if I ask what you need it for?
@erezsh I was trying to see which tokens were being matched so I could debug the lexer rules.
Can I still use the postlexer in my case, where an exception is thrown (so the lexing process isn't yet complete)?
Yes, the postlexer gets the tokens one by one, so if you save them somewhere (like in a global list, or inside the postlexer instance), you will have the latest list.
Lark doesn't save those tokens, because we want to support memory-efficient streaming. But perhaps we could do it when debug=True.
That would be wonderful for ease of development.