Non-ASCII (i.e. UTF-8) characters in strings cause problems as they are treated as bytes

Open tagesk opened this issue 4 years ago • 0 comments

In Python 3, the number of bytes in a string cannot be assumed to equal the number of characters. As a result, .encrypt(s) returns 11 bytes and not 12 (see below). I believe the best solution is to require the input to be bytes rather than a string. Furthermore, both parameters to AESMode.... should be bytes rather than strings (see the last example below). I have noticed that the README says, in a parenthetical, to pass bytes rather than strings, but passing a string should be detected.
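To make the proposal concrete, here is a minimal sketch of the kind of check I have in mind. The helper name `require_bytes` is hypothetical and is not part of pyaes; it only illustrates rejecting `str` input up front so that the caller has to choose an encoding explicitly.

```python
def require_bytes(value, name="plaintext"):
    """Hypothetical helper (not part of pyaes): accept only bytes-like input."""
    if not isinstance(value, (bytes, bytearray, memoryview)):
        raise TypeError(
            f"{name} must be bytes-like, not {type(value).__name__}; "
            "encode text explicitly, e.g. text.encode('utf-8')"
        )
    # Normalise bytearray/memoryview to an immutable bytes object.
    return bytes(value)


s = "coõperation"
data = require_bytes(s.encode("utf-8"))
print(len(s), len(data))  # 11 12

try:
    require_bytes(s)
except TypeError as exc:
    rejected = str(exc)
    print(rejected)
```

With a check like this at the top of encrypt() and in the mode constructors, a str argument would fail loudly instead of being converted character by character.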

I need a library such as this one and will be happy to contribute if you deem this issue worth the effort. In any case, I will add code to ensure that the code will coõperate with the rest of my (multi-lingual) environment.

>>> import pyaes
>>> # https://www.newyorker.com/culture/culture-desk/the-curse-of-the-diaeresis
>>> s = "coõperation"
>>> len(s)
11
>>> len(s.encode())
12
>>> len(pyaes.aes._string_to_bytes(s))
11
>>> pyaes.AESModeOfOperationOFB("This_key_for_demo_purposes_only!", iv ="InitializationVe") #From Readme
TypeError: a bytes-like object is required, not 'str'
>>> pyaes.AESModeOfOperationOFB(b"This_key_for_demo_purposes_only!", iv = "åååååååå") # 16 BYTES iv!
ValueError: initialization vector must be 16 bytes

This is the offending code (line 509 in aes.py):

    def encrypt(self, plaintext):
        encrypted = [ ]
        for p in _string_to_bytes(plaintext):
            ....
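
The length mismatch can be reproduced without pyaes. The sketch below assumes that the string-to-bytes conversion yields one integer per character (its code point); that is an assumption made for illustration, and `per_character_values` is a hypothetical stand-in, not the library's function.

```python
def per_character_values(text):
    # Assumed behaviour: one integer per character (its code point), not per byte.
    return [ord(c) for c in text]


s = "coõperation"
as_chars = per_character_values(s)
as_utf8 = list(s.encode("utf-8"))
print(len(as_chars), len(as_utf8))  # 11 12

# Code points above 255 cannot be represented as a single byte at all.
euro = per_character_values("€")
print(euro)  # [8364]
```

Under that assumption the loop processes 11 values for a 12-byte string, and characters outside Latin-1 produce values that are not valid bytes.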

tagesk, Dec 09 '20 19:12