
UnicodeDecodeError: 'gbk' codec can't decode byte (problem with Chinese characters)

Open kmcbest opened this issue 4 years ago • 3 comments

Test example:

example.zip

If I use markdown-pp index.mdpp -o out.md on these two files, markdown-pp throws this error:

UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 10: illegal multibyte sequence

Appending "-e latexrender" doesn't help in this case.

kmcbest, Jul 08 '20 15:07

What version of Python are you using? markdown-pp only supports unicode documents in Python 3.

amyreese, Jul 08 '20 18:07

> What version of Python are you using? markdown-pp only supports unicode documents in Python 3.

Python 3.8.2, my example files are in UTF-8 without BOM.

kmcbest, Jul 09 '20 00:07

The project reads files with Python's default encoding. If your system locale specifies an encoding other than UTF-8 (here, gbk), decoding the contents of a UTF-8 file will fail. You can override the system locale by setting the appropriate environment variables, and you can check the default encoding with locale.getpreferredencoding().

amyreese, Jul 25 '20 20:07
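
For readers hitting the same error, here is a minimal sketch of the check described in the last comment. It is not markdown-pp code: the helper names describe_default_encoding and read_text_utf8 and the file name sample.mdpp are hypothetical, chosen only for illustration. It prints the encoding Python falls back to when open() is called without an encoding argument, then shows that passing encoding="utf-8" explicitly reads Chinese text correctly whatever the locale is.

```python
import locale
import os
import tempfile


def describe_default_encoding() -> str:
    """Return the encoding open() uses when no encoding argument is given."""
    # Hypothetical helper; wraps the call mentioned in the thread.
    return locale.getpreferredencoding(False)


def read_text_utf8(path: str) -> str:
    """Read a file as UTF-8 regardless of the system locale."""
    # Hypothetical helper; markdown-pp itself may open files differently.
    with open(path, encoding="utf-8") as handle:
        return handle.read()


if __name__ == "__main__":
    # On a Chinese-locale Windows system this commonly reports a GBK code page.
    print("default encoding:", describe_default_encoding())

    # Write a small UTF-8 file containing Chinese text, then read it back
    # with an explicit encoding so the locale default is never consulted.
    sample = "标题: 中文测试\n"
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.mdpp")
        with open(path, "w", encoding="utf-8") as handle:
            handle.write(sample)
        print(read_text_utf8(path), end="")
```

If the reported default is not UTF-8, one environment setting that may apply the override mentioned above is PYTHONUTF8=1 (Python 3.7 and later), which switches the interpreter to UTF-8 mode so that open() defaults to UTF-8. Whether that is enough for markdown-pp on a given system has not been verified here.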