UnicodeDecodeError Windows Environment

Open TiagoDanielOliveira opened this issue 4 years ago • 8 comments

Hi mutant survivals,

We are facing some problems using mutmut in a Windows environment. It works fine in a Linux environment, but not for the team members on Windows.

Error:

image

Do you have any tip to fix this?

TiagoDanielOliveira, Mar 31 '21 08:03

Looks like your code isn't valid utf8. I highly recommend fixing this.

boxed, Mar 31 '21 11:03

Why does this error not occur in a Linux environment?

TiagoDanielOliveira, Mar 31 '21 11:03

Presumably because the code is utf8 there for some reason?

boxed, Mar 31 '21 11:03

It looks like the issue might lie with Python and Windows, and not with the code or mutmut itself.

I had a similar issue: my code was encoded in UTF-8 and I was using mutmut with pytest, and I still got the same error.

A bit of investigation showed that pytest was writing its output in cp1252, which then raised an exception when mutmut tried to read the pytest output.

I was able to solve this by forcing Python to write its output as UTF-8 instead of the default cp1252:

import sys
sys.stdout.reconfigure(encoding="utf-8")
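A minimal sketch of the kind of mismatch described above, not taken from mutmut or pytest themselves: one side writes bytes in cp1252 and the other side tries to read them as UTF-8. The sample string is made up for illustration.

```python
# Hypothetical output text containing non-ASCII characters.
text = "naïve café"

# One process writes the text using the Windows default code page...
raw_cp1252 = text.encode("cp1252")

# ...and another one assumes the bytes are UTF-8 when reading them back.
try:
    raw_cp1252.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError as exc:
    decode_failed = True
    print("decode failed:", exc)

# When both sides agree on UTF-8, the round trip is lossless.
round_trip = text.encode("utf-8").decode("utf-8")
print(round_trip == text)
```

Note that `sys.stdout.reconfigure` is only available on Python 3.7 and later.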

Duarte-Figueiredo, May 23 '21 09:05

This problem affected my application as well. The Python documentation says you can override the default locale encoding using environment variables, but that didn't work for me. What did work was invoking Python in UTF-8 mode.

Python versions: 3.8, 3.9 and 3.10:

python -X utf8 -m mutmut run --CI --no-progress
python -X utf8 -m mutmut html
python -X utf8 -m mutmut junitxml
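To check whether UTF-8 mode is actually active in a given interpreter, something like the following sketch may help. It starts a child interpreter and prints `sys.flags.utf8_mode`; the helper name `utf8_mode_flag` is made up for this example.

```python
import subprocess
import sys


def utf8_mode_flag(extra_args):
    """Run a child interpreter and return its sys.flags.utf8_mode as a string."""
    cmd = [
        sys.executable,
        *extra_args,
        "-c",
        "import sys; print(sys.flags.utf8_mode)",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout.strip()


# With -X utf8 the flag should be set; without it the value depends on
# the platform, locale and Python version.
print("with -X utf8:", utf8_mode_flag(["-X", "utf8"]))
print("default:     ", utf8_mode_flag([]))
```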

According to the Python documentation I found, it's recommended to specify the encoding when opening a file:

  • https://docs.python.org/3/library/io.html#text-encoding
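As a rough illustration of that recommendation (the file name and contents are made up, and this is not mutmut's actual code), passing `encoding="utf-8"` on both the write and the read makes the result independent of the platform's default encoding:

```python
import tempfile
from pathlib import Path

# Hypothetical source text with non-ASCII characters.
source = 'GREETING = "héllo wörld 🎉"\n'

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.py"

    # Explicit encoding on write, so the bytes on disk are UTF-8
    # regardless of the locale default (cp1252 on many Windows setups).
    with open(path, "w", encoding="utf-8") as f:
        f.write(source)

    # Explicit encoding on read, so decoding does not depend on the locale.
    with open(path, encoding="utf-8") as f:
        read_back = f.read()

print(read_back == source)
```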

WiredNerd, Feb 20 '23 01:02

@WiredNerd and @Duarte-Figueiredo are you sure you're not talking about unicode ENCODE problems? This ticket was opened about DECODE errors. But what you are talking about is the opposite.

boxed, Feb 20 '23 06:02

@boxed Some of each.

Here's my example project: https://github.com/WiredNerd/mutmut-test

mutmut run error: UnicodeEncodeError: 'charmap' codec can't encode character '\U0001f389' in position 368: character maps to <undefined>
mutmut html error: UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 11: character maps to <undefined>
mutmut junitxml error: UnicodeEncodeError: 'charmap' codec can't encode characters in position 1155-1157: character maps to <undefined>

https://github.com/WiredNerd/mutmut-test/actions/runs/4223944830
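Both kinds of 'charmap' error can be reproduced outside mutmut with the cp1252 codec directly. This is only a sketch of the underlying codec behaviour, not of the code paths in mutmut; the helper `describe` is hypothetical.

```python
def describe(action):
    """Run a callable and report which Unicode error it raises, if any."""
    try:
        action()
    except UnicodeError as exc:
        return type(exc).__name__
    return "ok"


# U+1F389 has no representation in cp1252, so encoding fails.
encode_result = describe(lambda: "\U0001f389".encode("cp1252"))

# Byte 0x8f is unassigned in Python's cp1252 codec, so decoding fails.
decode_result = describe(lambda: b"\x8f".decode("cp1252"))

# The same character round-trips cleanly through UTF-8.
utf8_result = describe(lambda: "\U0001f389".encode("utf-8").decode("utf-8"))

print(encode_result, decode_result, utf8_result)
```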

WiredNerd, Feb 20 '23 13:02

Just ran into the same problem while trying out mutmut under Windows. My Python source files are encoded in UTF-8 (which now seems to be the standard under Windows as well), but the default encoding used for reading and writing text files under Windows is still cp1252, so I get the same UnicodeDecodeError for any non-ASCII strings in the test code. Changing the open calls to use encoding="utf-8" fixes this (as would, probably, changing the encoding application-wide). It may make sense to do this globally in mutmut, if we can assume that all Python source code is UTF-8 nowadays.

There is of course the possibility to encode the file using another encoding as defined in PEP 263, which would not be recognized by mutmut, but I'm not sure if this is really used (apart from marking a file as utf-8, as I used to do). Given that nobody has complained so far, it is probably not an issue.
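For reference, the standard library can already detect a PEP 263 declaration. The sketch below is only an illustration of `tokenize.detect_encoding` and `tokenize.open` on a made-up file, not something mutmut is known to do:

```python
import tempfile
import tokenize
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical module saved as Latin-1 with a PEP 263 coding cookie.
    path = Path(tmp) / "latin1_module.py"
    path.write_bytes(
        "# -*- coding: latin-1 -*-\nNAME = 'Zoë'\n".encode("latin-1")
    )

    # detect_encoding reads the cookie (or a BOM) from the first two lines.
    with open(path, "rb") as f:
        detected, _ = tokenize.detect_encoding(f.readline)

    # tokenize.open opens the file in text mode using the detected encoding.
    with tokenize.open(str(path)) as f:
        content = f.read()

print(detected)
print(content)
```

Files without a declaration or BOM are treated as UTF-8 by these functions, which matches the default assumed for Python source code.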

mrbean-bremen, Jul 14 '23 19:07