python-soundfile
Test Unicode and bytes handling (Python 2 and 3) in all string arguments
After merging #119, the file argument should support str and unicode in Python 2 and str and bytes in Python 3. The arguments mode/format/subtype/endian should support str and unicode in Python 2 and only str in Python 3 (bytes should be disallowed there).
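To make the intended Python 3 rules concrete, here is a minimal sketch. The helpers `check_file_name` and `check_text_arg` are hypothetical names made up for illustration, not part of the library, and the sketch only covers the case where `file` is a file name.

```python
def check_file_name(file):
    """Hypothetical check: a file name may be text (str) or bytes."""
    if not isinstance(file, (str, bytes)):
        raise TypeError("file name must be str or bytes, not {0}".format(
            type(file).__name__))
    return file


def check_text_arg(value, name):
    """Hypothetical check for mode/format/subtype/endian: text only."""
    if not isinstance(value, str):
        raise TypeError("{0} must be str, not {1}".format(
            name, type(value).__name__))
    return value


check_file_name('test.wav')       # accepted
check_file_name(b'test.wav')      # accepted
check_text_arg('WAV', 'format')   # accepted

# bytes must be rejected for the non-file arguments in Python 3.
try:
    check_text_arg(b'WAV', 'format')
except TypeError as error:
    rejected = str(error)         # "format must be str, not bytes"
```

A test suite would need both halves: the accepted calls and the rejected one.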
There are some facts that are especially annoying when testing this:
- in Python 2, `unicode` can be implicitly converted/compared to `str` (as long as the string consists of only ASCII characters); this is not possible for Python 3's `str` and `bytes`. That means that test cases that pass in Python 2 may fail in Python 3.
- file names should be tested with both Unicode and byte strings. A `bytes` object may also contain non-ASCII characters. All combinations of Unicode/bytes and ASCII/non-ASCII should be tested.
- not only the success cases but also the expected failures should be tested.
- an (invalid) file extension may contain non-ASCII characters (but should still lead to a reasonable error message).
- for local files, the actual file system encoding is unknown, so anything that depends on `sys.getfilesystemencoding()` may be hard to test.
- as always, `'RAW'` files are special, so separate test cases have to be constructed for them.
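A rough sketch of how the file name matrix from the list above could be generated, Python 3 only and using nothing but the standard library. The base names, the `.wav` extension, the UTF-8 encoding of the byte names and the helper `filename_cases` are all assumptions picked for illustration; nothing here calls the library itself.

```python
import itertools
import os

# Hypothetical base names; any non-ASCII character would do.
BASE_NAMES = [('ascii', 'test'), ('non-ascii', 't\u00e4st')]


def filename_cases(extension='.wav'):
    """Return {label: name} for all text/bytes x ASCII/non-ASCII combinations."""
    cases = {}
    for (charset, base), kind in itertools.product(BASE_NAMES, ('text', 'bytes')):
        name = base + extension
        if kind == 'bytes':
            name = name.encode('utf-8')  # assumption: UTF-8 encoded file names
        cases[kind + '/' + charset] = name
    return cases


cases = filename_cases()

# Python 3 never treats text and bytes as equal, not even for pure ASCII.
assert cases['text/ascii'] != cases['bytes/ascii']

# Expected failure: mixing text and bytes raises instead of converting.
try:
    'test' + b'.wav'
except TypeError:
    mixing_failed = True
else:
    mixing_failed = False

# os.fsdecode/os.fsencode use the file system encoding; byte names should
# round-trip through it (on POSIX via the surrogateescape error handler).
raw = cases['bytes/non-ascii']
assert os.fsencode(os.fsdecode(raw)) == raw
```

Each of the four names could then be fed to both a success case and an expected-failure case, with a separate set of cases for `'RAW'` files.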
I repeat my recommendation here: Anyone who wants to know about the pitfalls of handling Unicode should watch this: http://nedbatchelder.com/text/unipain.html