ace2005chinese_preprocess

UnicodeDecodeError

Open wangbofan opened this issue 1 year ago • 0 comments

  File "D:\project\ace2005chinese_preprocess\main.py", line 3, in <module>
  File "D:\project\ace2005chinese_preprocess\main.py", line 296, in preprocessing
    parser = Parser(path=file)
  File "D:\project\ace2005chinese_preprocess\main.py", line 22, in __init__
    self.sents_with_pos = self.parse_sgm(path + '.sgm')
  File "D:\project\ace2005chinese_preprocess\main.py", line 95, in parse_sgm
    soup = BeautifulSoup(f.read(), features='html.parser')
UnicodeDecodeError: 'gbk' codec can't decode byte 0x88 in position 168: illegal multibyte sequence
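The 'gbk' codec in the error suggests the file was opened without an explicit encoding, so Python fell back to the Windows locale default. A minimal sketch of a possible workaround follows, assuming the `.sgm` files are UTF-8 encoded (this is not confirmed here). The helper `read_sgm_text` and the sample file are hypothetical and not part of `main.py`.

```python
import os
import tempfile


def read_sgm_text(path, encoding="utf-8"):
    """Hypothetical helper: read a file with an explicit encoding.

    Without the `encoding` argument, open() uses the platform's locale
    encoding, which is typically gbk (cp936) on a Chinese-locale Windows.
    The UTF-8 default here is an assumption about the corpus files.
    """
    with open(path, encoding=encoding) as f:
        return f.read()


def demo():
    # Write a small UTF-8 sample file, then read it back with the helper.
    sample = "<DOC><TEXT>新华社北京电</TEXT></DOC>"
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.sgm")
        with open(path, "w", encoding="utf-8") as f:
            f.write(sample)
        return read_sgm_text(path) == sample
```

If the assumption holds, the string returned by such a helper could be passed to `BeautifulSoup(..., features='html.parser')` in place of the `f.read()` call shown in the traceback. If the files use a different encoding, that name would need to be passed instead.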

wangbofan, Mar 07 '23 02:03