bert_for_corrector
File encoding problem
Hi, I tried running bert_corrector.py and hit a file encoding error. The details are as follows:
Traceback (most recent call last):
  File "D:/soft/bert_for_corrector/bert_corrector.py", line 73, in <module>
    d = BertCorrector()
  File "D:/soft/bert_for_corrector/bert_corrector.py", line 23, in __init__
    tokenizer=bert_model_dir)
  File "D:\Anaconda3\Lib\site-packages\transformers\pipelines.py", line 2727, in pipeline
    framework = framework or get_framework(model)
  File "D:\Anaconda3\Lib\site-packages\transformers\pipelines.py", line 110, in get_framework
    model = AutoModel.from_pretrained(model)
  File "D:\Anaconda3\Lib\site-packages\transformers\modeling_auto.py", line 624, in from_pretrained
    pretrained_model_name_or_path, return_unused_kwargs=True, **kwargs
  File "D:\Anaconda3\Lib\site-packages\transformers\configuration_auto.py", line 330, in from_pretrained
    config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
  File "D:\Anaconda3\Lib\site-packages\transformers\configuration_utils.py", line 374, in get_config_dict
    config_dict = cls._dict_from_json_file(resolved_config_file)
  File "D:\Anaconda3\Lib\site-packages\transformers\configuration_utils.py", line 456, in _dict_from_json_file
    text = reader.read()
  File "D:\Anaconda3\lib\codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
Is this error caused by the encoding of the model file?
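One way to narrow this down: the traceback fails inside _dict_from_json_file, so the file being read as a JSON config starts with byte 0x80. That byte is never a valid first byte of UTF-8 text, but it is the first byte of a pickle stream (protocol 2 and later), so a binary file may have been picked up where config.json was expected. The sketch below is only a diagnostic under that assumption; the helper name describe_config_candidate and the temporary file names are hypothetical and are not part of bert_corrector.py or transformers.

```python
import json
import pickle
import tempfile
from pathlib import Path


def describe_config_candidate(path):
    """Return 'json', 'pickle-like', or 'unknown' for the file at `path`."""
    raw = Path(path).read_bytes()
    if raw[:1] == b"\x80":
        # Pickle protocol 2+ streams start with the PROTO opcode 0x80,
        # which can never be the first byte of valid UTF-8 text.
        return "pickle-like"
    try:
        json.loads(raw.decode("utf-8"))
        return "json"
    except (UnicodeDecodeError, json.JSONDecodeError):
        return "unknown"


# Demo on two throwaway files (hypothetical names, not the real model files).
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "config.json"
    config_path.write_text(json.dumps({"vocab_size": 21128}), encoding="utf-8")

    weights_path = Path(tmp) / "weights.bin"
    weights_path.write_bytes(pickle.dumps({"w": [0.0]}, protocol=2))

    config_kind = describe_config_candidate(config_path)
    weights_kind = describe_config_candidate(weights_path)
    print(config_kind, weights_kind)
```

If the path passed as the model reports "pickle-like", the likely problem is which path is being passed, not the text encoding of the file.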
You can see the model file location here.
Or get the model file path from the README.
Thanks
Can the language model be retrained, or trained incrementally?
I ran into this problem too. How exactly did you solve it?
In the initialization code in bert_corrector.py, keep only the bert_model_dir variable; the other variables are not needed.
Did the model actually load after that?
I ran into the same problem. How did you solve it?
I'm on Ubuntu, so the file format may be different on your system. You could try changing the file format or modifying the code.
Change the statement to self.model = pipeline('fill-mask', model=bert_model_dir, tokenizer=bert_model_dir) and it works. It is probably a version issue.
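Since that call passes the same directory as both model and tokenizer, it can help to confirm the directory is complete before building the pipeline. This is a minimal sketch, not the repository's code: the helper names missing_model_files and build_fill_mask are hypothetical, and the list of expected files (config.json, vocab.txt) is an assumption about a typical BERT checkpoint directory, so adjust it to match the files listed in the README.

```python
import tempfile
from pathlib import Path

# Assumed file names for a BERT checkpoint directory; adjust if yours differ.
EXPECTED_FILES = ("config.json", "vocab.txt")


def missing_model_files(bert_model_dir):
    """Return the expected files that are absent from `bert_model_dir`."""
    root = Path(bert_model_dir)
    if not root.is_dir():
        # A path to a single file (for example the weights) is not a model directory.
        return list(EXPECTED_FILES)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


def build_fill_mask(bert_model_dir):
    """Create the fill-mask pipeline only after the directory check passes."""
    missing = missing_model_files(bert_model_dir)
    if missing:
        raise FileNotFoundError(f"{bert_model_dir} is missing: {missing}")
    # Imported lazily so the directory check works without transformers installed.
    from transformers import pipeline
    return pipeline("fill-mask", model=str(bert_model_dir), tokenizer=str(bert_model_dir))


# Demo of the check on a throwaway directory (no real model is loaded here).
with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp)
    (model_dir / "config.json").write_text("{}", encoding="utf-8")
    missing_before = missing_model_files(model_dir)

    (model_dir / "vocab.txt").write_text("[PAD]\n", encoding="utf-8")
    missing_after = missing_model_files(model_dir)

    # Passing a file instead of a directory reports everything as missing.
    missing_for_file = missing_model_files(model_dir / "config.json")
```

With a real checkpoint you would call build_fill_mask(bert_model_dir) and assign the result to self.model, as in the statement above.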