lark
Add scanning
An implementation of `Lark.scan`. Also adds `start_pos` and `end_pos` to `Lark.parse`, `Lark.parse_interactive` and `Lark.lex`.
TODO:
- [x] add example
- [x] A bit more documentation for what exactly this function does
- [x] Notes about `start_pos` and `end_pos` mirroring the behavior of the stdlib `re` module with regard to lookbehind and lookahead.
But I do think the core logic is pretty stable, and I would already like a review of that, @erezsh.
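For reference, here is a small sketch of the stdlib `re` behavior that the `start_pos`/`end_pos` note refers to. It only exercises `re` itself, not Lark, so it does not show how closely `Lark.parse` or `Lark.scan` follow this in every detail. The sample text and patterns are made up for illustration.

```python
import re

text = "ab"

# Lookbehind can still see characters before `pos`:
# the "a" at index 0 is visible even though matching starts at index 1.
lookbehind = re.compile(r"(?<=a)b")
m1 = lookbehind.match(text, 1)    # matches "b"
m2 = lookbehind.match(text[1:])   # no match: slicing discards the "a"

# Lookahead cannot see past `endpos`:
# the string is treated as if it ended at `endpos`.
lookahead = re.compile(r"a(?=b)")
m3 = lookahead.match(text, 0, 1)  # no match: the "b" lies beyond endpos
m4 = lookahead.match(text, 0, 2)  # matches "a"

print(m1, m2, m3, m4)
```

In other words, `pos` is not equivalent to slicing the start of the string, while `endpos` behaves as if the string were truncated.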
Future work:
- Check whether this already works with `mmap`, or what needs to be done to make it work, so that the text does not have to be loaded into memory at all (this also involves checking up on the byte parsing implementation)
- Check whether I can implement a custom lexer that uses Python's stdlib `tokenize` module, which would have a few benefits, especially with regard to the new f-string syntax, and see how well that would play with this feature.
This PR is based on #1428, so merging it first would be better.