Limnoria

utils.str: Handle \0 in perlReToReplacer

Open tatokis opened this issue 1 year ago • 3 comments

\g<0> is required here, because Python otherwise processes \0 in the replacement as a NUL byte.

Previous behaviour:

<User> @re "s/hello/\0 world/" "hello"
<Bot> '\x00 world'
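For illustration only, here is a minimal sketch of the kind of translation described above. It is not the actual patch and not Limnoria's `perlReToReplacer`. The helper name `expand_zero_backref` is hypothetical, and the sketch assumes `\0` is never meant to start an octal escape such as `\01`.

```python
import re

def expand_zero_backref(replacement: str) -> str:
    """Rewrite a Perl-style \\0 in a replacement template to \\g<0>.

    Hypothetical helper for illustration; assumes \\0 is never intended
    as the start of an octal escape like \\01.
    """
    def fix(match: re.Match) -> str:
        # Only the pair "backslash + 0" is rewritten; every other escape
        # (including an escaped backslash) is returned unchanged.
        return r"\g<0>" if match.group(1) == "0" else match.group(0)

    # Consume each backslash together with the character after it, so that
    # an escaped backslash followed by a literal 0 is not mistaken for \0.
    return re.sub(r"\\(.)", fix, replacement, flags=re.DOTALL)

# Plain Python behaviour: \0 in the template becomes a NUL byte.
print(repr(re.sub("hello", r"\0 world", "hello")))

# With the template rewritten first, \0 refers to the whole match.
print(re.sub("hello", expand_zero_backref(r"\0 world"), "hello"))
```

Matching the backslash and the following character as one unit is what keeps an escaped backslash followed by `0` intact. A naive `str.replace` would rewrite that case too.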

tatokis avatar May 12 '23 12:05 tatokis

I did some testing, and it seems that \0 is not a particularly standard construct among regex implementations. Some will treat it as NUL, some will treat it as a backreference to the whole match, while others will reject it as an error entirely: https://regex101.com/r/SCfTW4/1

I think in these cases it's better to preserve the Python behaviour to avoid introducing inconsistencies, and use \g<0> explicitly in your regexps as needed.

<jlu5_> re "s/o/\0\0/" "hello"
-bitmonster- 'hell\x00\x00'
<jlu5_> re "s/o/\g<0>\g<0>/" "hello"
-bitmonster- helloo
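The two bot results above can be reproduced with Python's `re.sub` directly. This is a small sketch using only the standard library. It assumes the bot passes the replacement text to `re.sub` unchanged, which the thread suggests but does not show.

```python
import re

# \0 in a replacement template is read as an octal escape, i.e. a NUL byte.
nul_result = re.sub("o", r"\0\0", "hello")

# \g<0> is Python's explicit reference to the entire match.
backref_result = re.sub("o", r"\g<0>\g<0>", "hello")

print(repr(nul_result))   # 'hell\x00\x00'
print(backref_result)     # helloo
```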

Interestingly, it seems Perl also treats \0 as NUL:

$ perl -p -e 's/(o)/\0/g' <<< foobar | hexdump -c
0000000   f  \0  \0   b   a   r  \n                                    
0000007

jlu5 avatar May 14 '23 18:05 jlu5

> Some will treat it as NUL

this one doesn't really make sense on IRC

progval avatar May 14 '23 18:05 progval

> > Some will treat it as NUL
>
> this one doesn't really make sense on IRC

Given the support for nested commands, though, not all command output is necessarily meant to be displayed.

jlu5 avatar May 14 '23 18:05 jlu5