Allow pest to match against byte literals
For example, if we would like to match one or more raw bytes (like 7c or FFFF), we should be able to. This would take a bit of design work, however, since many places in the code assume that we'll be dealing with UTF-8 strings.
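To make the request concrete, here is a rough sketch of what matching byte literals against a `&[u8]` input could look like. `match_bytes` and `match_one_or_more` are hypothetical helpers made up for illustration. They are not part of pest and say nothing about how pest would actually implement this.

```rust
/// Hypothetical helper, not part of pest: tries to match `literal` at byte
/// offset `pos` in `input` and returns the offset just past the match.
fn match_bytes(input: &[u8], pos: usize, literal: &[u8]) -> Option<usize> {
    // `get` returns `None` instead of panicking when `pos` is past the end.
    let rest = input.get(pos..)?;
    if rest.starts_with(literal) {
        Some(pos + literal.len())
    } else {
        None
    }
}

/// Hypothetical helper: matches one or more repetitions of `literal`
/// starting at `pos`, returning the offset after the last repetition.
fn match_one_or_more(input: &[u8], pos: usize, literal: &[u8]) -> Option<usize> {
    // An empty literal would match forever without consuming input.
    if literal.is_empty() {
        return None;
    }
    let mut end = match_bytes(input, pos, literal)?;
    while let Some(next) = match_bytes(input, end, literal) {
        end = next;
    }
    Some(end)
}
```

Note that offsets here are plain byte indices, so nothing requires the input to be valid UTF-8.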
Request from @Restioson
The redesign should tackle the issue of capturing strings from the input with the generated Spans. Currently, these guarantee cheap UTF-8 captures.
Yes hello, thanks. For more info: my use case is matching actual bytes (not literals) for capturing values in AML.
@Restioson, do you need this feature soon? Going through the design work to add this feature will probably take some time. Probably sometime after the 2.0 launch.
The big issue here is that Position and Span only work on UTF-8 boundaries. The types guarantee this. For byte parsing to work, one needs to either use completely different types (imagine a 2.x release) or rebase the current types to handle both cases somehow (3.0+).
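As a small illustration of why the boundary guarantee matters, here is a sketch using only the Rust standard library. `capture_str` and `capture_bytes` are hypothetical stand-ins for a capture operation. They are not pest's `Span` API.

```rust
/// Hypothetical stand-in for a string capture: returns the text between two
/// byte offsets, or `None` if either offset is out of range or falls inside
/// a multi-byte UTF-8 sequence.
fn capture_str(input: &str, start: usize, end: usize) -> Option<&str> {
    // `str::get` checks char boundaries; `&input[start..end]` would panic instead.
    input.get(start..end)
}

/// Hypothetical stand-in for a byte capture: any in-range offsets are valid,
/// because a byte slice has no notion of char boundaries.
fn capture_bytes(input: &[u8], start: usize, end: usize) -> Option<&[u8]> {
    input.get(start..end)
}
```

With the input `"aé"` (where `é` is encoded as the two bytes `C3 A9`), `capture_str(input, 1, 2)` returns `None` because offset 2 splits the character, while `capture_bytes(input.as_bytes(), 1, 2)` succeeds. A byte-oriented Span would have to give up the first guarantee or live alongside it.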
TBH we're probably going to do it with a handwritten parser that @IsaacWoods wrote a while back (but nevertheless I really like this library and would like to keep helping). So, no timeframe really.
bump, any progress on this? i might be needing this soon (am parsing IMAP with pest) so i'll probably implement a proof-of-concept in a bit
Can you put it on the agenda?
@12089897411 I posted it as one of the ideas here: https://github.com/pest-parser/pest/discussions/885#discussioncomment-6449851 Feel free to upvote or comment on it. It likely won't be an initial priority in pest3, but once the pest3 codebase settles, it'll be more open to experimenting with changes in that regard.