
pyt usually picks the wrong encoding to load files

Open matthewdeanmartin opened this issue 3 years ago • 1 comment

Traceback (most recent call last):
  File "c:\users\matth\appdata\local\programs\python\python38\lib\runpy.py", line 193, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\matth\appdata\local\programs\python\python38\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\matth\.local\bin\pyt.exe\__main__.py", line 7, in <module>
  File "c:\users\matth\.local\pipx\venvs\python-taint\lib\site-packages\pyt\__main__.py", line 92, in main
    nosec_lines[path] = retrieve_nosec_lines(path)
  File "c:\users\matth\.local\pipx\venvs\python-taint\lib\site-packages\pyt\__main__.py", line 57, in retrieve_nosec_lines
    lines = file.readlines()
  File "c:\users\matth\appdata\local\programs\python\python38\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 2105: character maps to <undefined>

Sometimes this helps:

export PYTHONIOENCODING=utf-8
export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8

But today it didn't, so I'm about ready to stop using pyt. I'm somewhat worried I've been using it wrong for a few years because, of the various tools I use, pyt is the one that never complained about anything (i.e. it found no vulnerabilities or bugs, neither true positives nor false positives).

If anyone ever takes over this project, then all the file open() calls should either specify utf-8 (a better "guess") or use chardet to make a really good guess.
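
For illustration, here is a minimal sketch of the first option, passing an explicit encoding instead of relying on the platform default (cp1252 in the traceback above). The helper name `read_lines_utf8` is hypothetical and is not pyt's actual code; `errors="replace"` is an assumption chosen so that a stray byte degrades to a placeholder character rather than aborting the run. Note that `PYTHONIOENCODING` only affects the standard streams, not `open()`, which may be why the exports above did not help here.

```python
def read_lines_utf8(path):
    """Read a text file as UTF-8 regardless of the platform's default encoding.

    Hypothetical helper for illustration; not part of pyt.
    """
    # An explicit encoding avoids falling back to the locale code page
    # (cp1252 on many Windows installs). errors="replace" is an assumption:
    # undecodable bytes become U+FFFD instead of raising UnicodeDecodeError.
    with open(path, encoding="utf-8", errors="replace") as file:
        return file.readlines()


def decodes_as_cp1252(data):
    """Return True if the bytes can be decoded with the cp1252 codec."""
    try:
        data.decode("cp1252")
    except UnicodeDecodeError:
        return False
    return True
```

A UTF-8 encoded "Á" is the byte pair 0xC3 0x81, and 0x81 has no mapping in cp1252, so a source file containing such a character is one plausible way to hit the error in the traceback.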

matthewdeanmartin, Jan 24 '21 14:01

Try doing python3 -X utf8 -m pyt

FredHappyface, Mar 05 '21 02:03
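
As a rough way to check what the `-X utf8` suggestion does, the sketch below starts a child interpreter and reports its `sys.flags.utf8_mode`. The helper name `utf8_mode_flag` is hypothetical. When UTF-8 mode is on (Python 3.7 and later), `open()` without an `encoding` argument defaults to UTF-8 instead of the locale code page.

```python
import subprocess
import sys


def utf8_mode_flag(extra_args=()):
    """Return sys.flags.utf8_mode as reported by a child interpreter.

    Hypothetical helper for illustration; extra_args are interpreter
    options placed before -c, e.g. ["-X", "utf8"].
    """
    cmd = [
        sys.executable,
        *extra_args,
        "-c",
        "import sys; print(sys.flags.utf8_mode)",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return int(result.stdout.strip())
```

Calling `utf8_mode_flag(["-X", "utf8"])` shows the flag set in the child process; without the option, the result depends on the Python version and on environment variables such as `PYTHONUTF8`.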