rust-peg

Support byte string literals in grammars for `[u8]`

Open A4-Tacks opened this issue 1 year ago • 4 comments

peg::parser!(grammar g() for [u8] {
    rule x() = b"\xff" // invalid UTF-8, so a str literal cannot be used
});

b"\xff" is not valid UTF-8
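
For context, the byte 0xFF never occurs in well-formed UTF-8, which is why this input cannot be expressed as a str literal. A minimal standard-library sketch of that check follows; the helper name `is_valid_utf8` is hypothetical and not part of rust-peg:

```rust
/// Hypothetical helper: returns true when `bytes` is well-formed UTF-8,
/// i.e. when it could be viewed as a `str` without copying.
fn is_valid_utf8(bytes: &[u8]) -> bool {
    std::str::from_utf8(bytes).is_ok()
}
```

Calling `is_valid_utf8(b"\xff")` returns `false`, while plain ASCII input such as `b"abc"` returns `true`.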

A4-Tacks avatar Jul 29 '24 00:07 A4-Tacks

I am trying to skip over a UTF-8 BOM:

0xEF 0xBB 0xBF

Ralle avatar Oct 01 '25 17:10 Ralle

> I am trying to skip over a UTF-8 BOM:

Use ['\xEF'] ['\xBB'] ['\xBF'] instead of "\xEF\xBB\xBF".

A4-Tacks avatar Oct 02 '25 03:10 A4-Tacks

Neither '\xEF' nor "\xEF\xBB\xBF" are valid literals in Rust. A BOM can be written as "\u{FEFF}" though, and in rust-peg that will match that multi-byte sequence in either a str or [u8].

The original feature request still stands for matching non-UTF-8 sequences in [u8], but in this case the BOM is a valid Unicode code point.
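
To illustrate why the "\u{FEFF}" literal covers the BOM case, here is a small sketch using only the standard library (no rust-peg). It shows that the code point U+FEFF encodes to the three bytes 0xEF 0xBB 0xBF in UTF-8. The names `BOM` and `skip_bom` are hypothetical, chosen for this example:

```rust
/// U+FEFF written as an ordinary string literal; its UTF-8 encoding
/// is the three-byte sequence EF BB BF.
const BOM: &str = "\u{FEFF}";

/// Hypothetical helper: drops a leading UTF-8 BOM from a byte slice,
/// returning the input unchanged when no BOM is present.
fn skip_bom(input: &[u8]) -> &[u8] {
    input.strip_prefix(BOM.as_bytes()).unwrap_or(input)
}
```

For input beginning with the bytes EF BB BF, `skip_bom` returns the remainder of the slice; any other input is returned as-is.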

kevinmehall avatar Oct 02 '25 14:10 kevinmehall

> Neither '\xEF' nor "\xEF\xBB\xBF" are valid literals in Rust.

Oh, b'\xEF' is a valid literal.
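
As a sketch of the distinction in plain Rust: `\x` escapes in char and str literals are limited to `\x7F`, whereas byte and byte string literals accept values up to `\xFF`. The constant and function names below are hypothetical, and whether rust-peg accepts such literals inside a grammar is exactly what this issue requests, so the example stays outside `peg::parser!`:

```rust
/// A byte literal has type `u8`; escapes up to `\xFF` are allowed.
const FIRST_BOM_BYTE: u8 = b'\xEF';

/// A byte string literal has type `&[u8; N]` and need not be valid UTF-8.
const BOM_BYTES: &[u8; 3] = b"\xEF\xBB\xBF";

/// Hypothetical helper: checks whether `input` begins with the three BOM bytes.
fn starts_with_bom(input: &[u8]) -> bool {
    input.starts_with(BOM_BYTES)
}
```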

A4-Tacks avatar Oct 02 '25 14:10 A4-Tacks