ChineseNER
'utf-8' codec can't decode byte 0xa3 in position 0: invalid start byte
Traceback (most recent call last):
File "E:\python2.7\pycharm\PyCharm 4.5.5\helpers\pydev\pydevd.py", line 2358, in
I'm on TensorFlow 1.3. Has anyone run into a similar problem? Is there a fix?
Have you solved it?
Not yet, sadly.
What's wrong with it? My TensorFlow is 1.4.
I found that this problem can be solved as follows.
In utils.py, change test_ner to:
    def test_ner(results, path):
        """
        Run perl script to evaluate model
        """
        output_file = os.path.join(path, "ner_predict.utf8")
        # Write the predictions explicitly as UTF-8 so they can be read back as UTF-8
        with open(output_file, "w", encoding='utf8') as f:
            to_write = []
            for block in results:
                for line in block:
                    to_write.append(line + "\n")
                to_write.append("\n")
            f.writelines(to_write)
        eval_lines = return_report(output_file)
        return eval_lines
The reason is that a file can only be opened as "utf8" if it was also written as "utf8". It has nothing to do with the TensorFlow version.
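To illustrate the point above, here is a minimal sketch that reproduces the error outside the project. The helper `write_then_read` and the sample string are made up for this example, and "gbk" stands in for the default encoding on a Chinese-locale Windows machine, which is an assumption about the original poster's setup.

```python
import os
import tempfile


def write_then_read(text, write_encoding, read_encoding):
    """Write text with one encoding and read it back with another."""
    fd, path = tempfile.mkstemp(suffix=".utf8")
    os.close(fd)
    try:
        with open(path, "w", encoding=write_encoding) as f:
            f.write(text)
        with open(path, "r", encoding=read_encoding) as f:
            return f.read()
    finally:
        os.remove(path)


# The sample starts with a fullwidth comma, which GBK encodes as bytes 0xA3 0xAC.
sample = "，北京"

# Same encoding on both sides: the text round-trips unchanged.
roundtrip = write_then_read(sample, "utf8", "utf8")

# Written as GBK but read as UTF-8: decoding fails on the first byte.
try:
    write_then_read(sample, "gbk", "utf8")
    mismatch_error = None
except UnicodeDecodeError as exc:
    mismatch_error = exc

print(roundtrip == sample)
print(mismatch_error)
```

Calling `open(output_file, "w")` without an `encoding` argument uses the platform default, which is why the same code can behave differently on Linux and Windows.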
@yyHaker Good job, this helped me solve the problem. Thanks!
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa3 in position 0: invalid start byte
2018-02-02 12:08:27,160 - log\train.log - INFO - iteration:1 step:1000/1044, NER loss: 5.380470
2018-02-02 12:08:36,132 - log\train.log - INFO - evaluate:dev
Traceback (most recent call last):
It still hits the same encoding problem when it gets to this point. Has anyone else run into this?
This is still the encoding problem. You can step through with a debugger to find where it occurs.
@yyHaker Thanks!
This is an encoding problem. If you are working on Linux, convert the file's encoding with Notepad++. But if you are working on Windows, use this:
    import codecs
    with codecs.open(filename, 'r', 'utf-8') as f:
        text = f.read()  # your processing goes here
It is very easy. You just need to change 'utf-8' to 'gbk' in return_report in utils.py.
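If it is unclear which encoding a file was written with, one option is to try UTF-8 first and fall back to GBK. This is only a sketch: `read_text_with_fallback` is a hypothetical helper, not part of the project's utils.py, and the encoding list is an assumption. Some GBK byte sequences also happen to be valid UTF-8, so fixing the writer's encoding remains the more reliable fix.

```python
import os
import tempfile


def read_text_with_fallback(path, encodings=("utf-8", "gbk")):
    """Return (text, encoding) for the first encoding that decodes the file."""
    with open(path, "rb") as f:
        raw = f.read()
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("none of %r could decode %s" % (encodings, path))


# Demo: a file written as GBK is not valid UTF-8, so the helper falls back.
fd, demo_path = tempfile.mkstemp()
os.close(fd)
try:
    with open(demo_path, "wb") as f:
        f.write("，北京".encode("gbk"))
    demo_text, demo_encoding = read_text_with_fallback(demo_path)
finally:
    os.remove(demo_path)

print(demo_encoding)
```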