transformer
'gbk' codec can't decode byte 0x93 in position 978: illegal multibyte sequence, and then: a bytes-like object is required, not 'str'
Hi, when I first ran this code, I got:
File "D:/transformer/prepro.py", line 37, in
UnicodeDecodeError: 'gbk' codec can't decode byte 0x93 in position 978: illegal multibyte sequence
After I changed this line to
_prepro = lambda x: [line.strip() for line in open(x, 'rb').read().split("\n")
if not line.startswith("<")]
I got a different error: a bytes-like object is required, not 'str'.
So how should I open this file? Looking forward to your reply.
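Add encoding='utf-8' to the open() call when you open the file. In text mode without an explicit encoding, open() falls back to the locale's default codec (gbk on your system), and in 'rb' mode read() returns bytes, so split("\n") with a str argument fails.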
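For reference, here is a minimal sketch of that fix, assuming the data file is UTF-8 encoded (the thread does not say which encoding it actually uses). The function name prepro, its encoding parameter, and the temporary sample file are made up for illustration and are not taken from prepro.py.

```python
import os
import tempfile


def prepro(path, encoding="utf-8"):
    """Return the stripped lines of a text file, skipping lines that start with '<'.

    Illustrative helper: the name and the `encoding` parameter are assumptions.
    """
    # Text mode with an explicit encoding: read() returns str, so split("\n")
    # and startswith("<") work, and the locale default (e.g. gbk) is not used.
    with open(path, "r", encoding=encoding) as f:
        return [line.strip() for line in f.read().split("\n")
                if not line.startswith("<")]


# Self-check on a throwaway UTF-8 file that contains a non-ASCII character.
with tempfile.NamedTemporaryFile("w", encoding="utf-8", suffix=".txt",
                                 delete=False) as tmp:
    tmp.write("<doc id=1>\nGuten Morgen \u2013 hallo\n</doc>")
    sample_path = tmp.name

result = prepro(sample_path)
os.remove(sample_path)
print(result)
```

If the file turns out not to be UTF-8, the same call will raise UnicodeDecodeError again, and the encoding argument would need to match whatever the file was actually saved in.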
NB!