arc4
Passing a unicode string as key or data leads to mysterious errors
Currently, all methods defined in ARC4, including the constructor, accept not only bytes but also unicode str.
However, passing a unicode str to ARC4 methods requires developers to be very careful, as the following example shows.
>>> from arc4 import ARC4
>>> with open('key-in-utf-8.txt') as f:
...     arc4 = ARC4(f.read())
>>> with open('cipher-utf-8.txt') as f:
...     print(arc4.decrypt(f.read()))  # success
>>> with open('cipher-latin-1.txt') as f:
...     print(arc4.decrypt(f.read()))  # failure
Traceback (most recent call last):
  File "/path/to/script.py", line 7, in <module>
    print(arc4.decrypt(f.read()))
  File "/path/to/python3.9/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf4 in position 3: invalid continuation byte
The example is simple enough that an experienced developer is unlikely to make this mistake. But once someone changes the text file's encoding, or the print function is called from another module, the root cause can be hard to find.
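Note that the traceback ends in codecs.py: the error is raised while f.read() decodes the file in text mode, before ARC4 receives anything. The following is a minimal sketch of that behavior using only the standard library. The sample text "café" and the temporary file are assumptions made for illustration and do not come from the report above.

```python
import os
import tempfile

# Hypothetical sample data: "é" is a single byte (0xe9) in Latin-1,
# which is not a valid UTF-8 sequence on its own.
latin1_payload = "café".encode("latin-1")  # b'caf\xe9'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cipher-latin-1.txt")
    with open(path, "wb") as f:
        f.write(latin1_payload)

    # Text mode decodes the file inside read(), so a wrong encoding
    # fails here, before the data reaches any other function.
    try:
        with open(path, encoding="utf-8") as f:
            f.read()
        text_mode_failed = False
    except UnicodeDecodeError:
        text_mode_failed = True

    # Binary mode returns the raw bytes unchanged, whatever the encoding.
    with open(path, "rb") as f:
        raw = f.read()

print(text_mode_failed, raw)
```

Reading keys and ciphertext in binary mode ('rb') sidesteps the decoding step entirely, which is what a bytes-only API would encourage.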
So I will change ARC4 to accept only bytes in future releases.
As a first step, arc4 0.3.0 will emit a DeprecationWarning when users pass a unicode str to the constructor or any public method of ARC4.
Later, possibly in arc4 1.0.0, passing a unicode str will be officially unsupported.
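For illustration, the deprecation step could look roughly like the pure-Python sketch below. It is not arc4's actual implementation: the helper name ensure_bytes, the warning message, and the UTF-8 fallback encoding are all assumptions.

```python
import warnings


def ensure_bytes(value, name="data"):
    """Hypothetical helper: return bytes, warning if a str was passed."""
    if isinstance(value, str):
        warnings.warn(
            f"passing str as {name} is deprecated; pass bytes instead",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        # Assumption: str input is encoded as UTF-8.
        return value.encode("utf-8")
    if isinstance(value, (bytes, bytearray, memoryview)):
        return bytes(value)
    raise TypeError(f"{name} must be bytes, not {type(value).__name__}")


# Usage: record warnings so the deprecation is visible in this snippet.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    from_str = ensure_bytes("key", name="key")
    from_bytes = ensure_bytes(b"key", name="key")

print(from_str, from_bytes, [w.category.__name__ for w in caught])
```

Python hides DeprecationWarning by default unless it is triggered by code in __main__, so users may need to run with -W default (or configure the warnings filter) to see it.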