Eric Brine

Results: 39 comments by Eric Brine

PCRE is the name of a library, and not one used by Oniguruma, no. On Wed., Jun. 22, 2022, 6:31 p.m. sus-impost0r, ***@***.***> wrote: > I believe Oniguruma supports PCRE?...

(And neither uses the same regex language as Perl.)

This is a violation of Unicode.

> A conformant process must not interpret illegal or ill-formed byte sequences as characters

> A sequence such as 110xxxxx[b2] 0xxxxxxx[b2] is ill-formed and...
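To make the quoted byte pattern concrete, here is a minimal Python sketch. It assumes Python's strict `utf-8` codec as the reference for well-formedness; the helper name `is_well_formed_utf8` and the sample bytes are illustrative and do not come from the thread or from `jq`.

```python
def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if `data` decodes under Python's strict UTF-8 codec."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# 110xxxxx followed by 0xxxxxxx: a two-byte lead byte followed by a byte
# that is not a continuation byte (10xxxxxx). This is the ill-formed shape
# described in the quoted passage.
ill_formed = bytes([0b11000011, 0b00101000])

# 110xxxxx followed by 10xxxxxx: a well-formed two-byte sequence (U+00E9).
well_formed = bytes([0b11000011, 0b10101001])

print(is_well_formed_utf8(ill_formed))
print(is_well_formed_utf8(well_formed))
```

A strict decoder rejects the first sequence outright instead of treating either byte as part of a character, which is the behavior the quoted conformance clause asks for.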

> UTF-8 errors are maintained and UTF-16 errors are converted into replacement characters.

`jq` should not behave differently based on the encoding of the input.

On Tue., Jan. 11, 2022, 4:35 p.m. Maxdamantus, ***@***.***> wrote: > > jq should not behave differently based on the encoding of the input. > > I'm not sure I...

> My implementation does not interpret illegal or ill-formed byte sequences as characters.

Yes it does cause this to happen. It could be a string with an invalid byte, which...

> The paragraph goes on to demonstrate concatenation of ill-formed UTF-16 strings to create a well-formed UTF-16 string

Yes, but that isn't relevant. At issue is the production of invalid...
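For readers unfamiliar with the case the quote refers to, the following Python sketch shows two ill-formed UTF-16 fragments whose concatenation is well-formed. It assumes little-endian UTF-16 and Python's strict `utf-16-le` codec; the variable and function names are illustrative and not taken from the thread.

```python
def decodes_strictly(data: bytes) -> bool:
    """Return True if `data` is accepted by the strict UTF-16LE codec."""
    try:
        data.decode("utf-16-le")
    except UnicodeDecodeError:
        return False
    return True


# A lone high surrogate (U+D83D) and a lone low surrogate (U+DE00),
# each encoded as a single little-endian 16-bit code unit.
high = b"\x3d\xd8"
low = b"\x00\xde"

print(decodes_strictly(high))        # ill-formed on its own
print(decodes_strictly(low))         # ill-formed on its own
print(decodes_strictly(high + low))  # the pair forms one code point
```

The concatenated pair decodes to the single code point U+1F600. This only illustrates the quoted point about concatenation; it does not bear on whether a process may emit ill-formed output.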

> > I did not look at the code. You're the one who said it behaved differently in > > the passage I quoted ("UTF-8 errors are maintained and UTF-16...

> Invalid UTF-16 strings are only produced from already invalid UTF-16 strings

Yes, I know. Your code only generates an invalid output string when given an invalid input string. That's...

> How can a buffer (a Unicode string) contain ill-formed Unicode without being produced?

The buffer in question doesn't contain Unicode. It contains bytes received. That's the whole point. And...