Can’t safely put NUL or CR bytes inside a double-quoted string
Inside a double-quoted string, Pyth translates CR (\r) to LF (\n). NUL bytes (\000) seem to work unless followed by a digit 0–7, because Pyth translates them to \0 instead of \000.
$ printf '"\r"' | xxd
00000000: 220d 22 "."
$ printf '"\r"' | pyth -d /dev/stdin
==================== 3 chars =====================
"
"
==================================================
imp_print("\n")
==================================================
$ printf '"\00012"' | xxd
00000000: 2200 3132 22 ".12"
$ printf '"\00012"' | pyth -d /dev/stdin
==================== 5 chars =====================
"12"
==================================================
imp_print("\012")
==================================================
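To illustrate why the short escape breaks, here is a minimal Python sketch. The two escaper functions are hypothetical stand-ins, not Pyth's actual code; they only show how Python reads the generated literal. An octal escape takes up to three digits, so \0 followed by 12 is read as \012 (LF), while \000 followed by 12 stays a NUL plus two digits.

```python
import ast


def escape_nul_short(s: str) -> str:
    # Hypothetical escaper: writes NUL as the short escape \0.
    return s.replace("\0", "\\0")


def escape_nul_full(s: str) -> str:
    # Hypothetical escaper: always writes three octal digits, so a
    # following digit 0-7 cannot be absorbed into the escape.
    return s.replace("\0", "\\000")


source = "\x0012"  # NUL followed by the characters "1" and "2"

short_literal = '"' + escape_nul_short(source) + '"'  # the text "\012"
full_literal = '"' + escape_nul_full(source) + '"'    # the text "\00012"

# Evaluate the generated literals the way Python would read them.
short_value = ast.literal_eval(short_literal)  # a single LF character
full_value = ast.literal_eval(full_literal)    # NUL, "1", "2"

print(repr(short_value), repr(full_value))
```

The short form silently turns three characters into one, which matches the imp_print("\012") output above.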
I've fixed the null byte issue, but the CR issue seems to be introduced by Python. I'll need to investigate more for that one.
If you replace open(file_or_string, encoding='iso-8859-1') with open(file_or_string, encoding='iso-8859-1', newline=''), then Python will stop translating \r and \r\n to \n. Of course, you may then need to teach Pyth to keep accepting \r and \r\n in various other places where newlines are significant, to keep Mac and Windows users happy.
(It may be cleaner, but more work, to open in binary mode and use bytes everywhere?)
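A small standalone sketch of the difference, assuming nothing about Pyth's internals: the read_text helper and the temporary file are only for illustration. It reads the same bytes three ways: with default newline handling, with newline='', and in binary mode followed by an explicit decode.

```python
import os
import tempfile


def read_text(path, **kwargs):
    # Hypothetical helper mirroring the open() call quoted above.
    with open(path, encoding="iso-8859-1", **kwargs) as f:
        return f.read()


# Write a quoted CR and a quoted CRLF as raw bytes.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'"\r" "\r\n"')
    path = tmp.name

try:
    translated = read_text(path)                # default: \r and \r\n become \n
    untranslated = read_text(path, newline="")  # line endings left untouched
    with open(path, "rb") as f:                 # binary mode, decoded by hand
        decoded = f.read().decode("iso-8859-1")
finally:
    os.remove(path)

print(repr(translated), repr(untranslated), repr(decoded))
```

Both newline='' and binary mode preserve the CR. With newline='' the rest of the code can keep working with str, which is probably the smaller change.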
\r hasn't been used on Mac for a while now
There are similar issues with \ followed by NUL or LF or CR.
\␀ ↦ imp_print("␀") ↦ ValueError: source code string cannot contain null bytes
\␊ or \␍ ↦ IndexError: string index out of range