PaddleHub
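Problem when replacing the model used in the original documentation for NLP text classification
I am doing text classification following the documentation: https://github.com/PaddlePaddle/PaddleHub/tree/release/v2.1/demo/text_classification
Environment: Python 3.7; Windows 10; paddlepaddle=2.2.2; paddlehub=2.2.0
The original documentation uses the model name=ernie_tiny; I want to replace it with name=ernie, version 2.0.0.
I changed the line directly to: model = hub.Module(name='ernie', version='2.0.0', task='seq-cls')
This produces the following error: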
[2022-04-02 10:13:23,557] [ WARNING] - An error was encountered while loading ernie. Detailed error information can be found in the C:\Users\admin.paddlehub\log\20220402.log.
[2022-04-02 10:13:23,561] [ WARNING] - An error was encountered while loading ernie. Detailed error information can be found in the C:\Users\admin.paddlehub\log\20220402.log.
Download https://bj.bcebos.com/paddlehub/paddlehub_dev/ernie_2.0.0.tar.gz
[##################################################] 100.00%
Decompress C:\Users\admin.paddlehub\tmp\tmpkq594c3d\ernie_2.0.0.tar.gz
[##################################################] 100.00%
[2022-04-02 10:13:24,844] [ INFO] - Successfully uninstalled ernie
Traceback (most recent call last):
File "train.py", line 38, in
model = hub.Module(name='ernie', version='2.0.0', task='seq-cls')  # select the model; model download location: C:\Users\admin.paddlenlp\models
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddlehub\module\module.py", line 395, in new
**kwargs)
ignore_env_mismatch=ignore_env_mismatch)
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddlehub\module\manager.py", line 190, in install
return self._install_from_name(name, version, ignore_env_mismatch)
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddlehub\module\manager.py", line 265, in _install_from_name
return self._install_from_url(item['url'])
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddlehub\module\manager.py", line 258, in _install_from_url
return self._install_from_archive(file)
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddlehub\module\manager.py", line 380, in _install_from_archive
return self._install_from_directory(directory)
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddlehub\module\manager.py", line 364, in _install_from_directory
File "
We recommend using the latest version of the ernie model: https://www.paddlepaddle.org.cn/hubdetail?name=ernie&en_category=SemanticModel
model = hub.Module(name='ernie', version='2.0.2', task='seq-cls')
Following this advice and using the latest ernie version, I get the following problem:
PS D:\PaddleHub\PaddleHub\demo\text_classification> python train.py
[2022-04-06 14:55:57,042] [ INFO] - Already cached C:\Users\admin.paddlenlp\models\ernie-1.0\ernie_v1_chn_base.pdparams
W0406 14:55:57.045624 18448 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 11.4, Runtime API Version: 10.2
W0406 14:55:57.052621 18448 device_context.cc:465] device: 0, cuDNN Version: 7.6.
[2022-04-06 14:56:00,553] [ INFO] - Already cached C:\Users\admin.paddlenlp\models\ernie-1.0\vocab.txt
[2022-04-06 14:56:07,959] [ INFO] - Already cached C:\Users\admin.paddlenlp\models\ernie-1.0\vocab.txt
[2022-04-06 14:56:08,890] [ INFO] - Already cached C:\Users\admin.paddlenlp\models\ernie-1.0\vocab.txt
[2022-04-06 14:56:09,814] [ INFO] - PaddleHub model checkpoint loaded. current_epoch=9 [acc=0.9400]
Traceback (most recent call last):
File "train.py", line 56, in
save_interval=args.save_interval,
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddlehub\finetune\trainer.py", line 213, in train
self.optimizer_step(self.current_epoch, batch_idx, self.optimizer, loss)
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddlehub\finetune\trainer.py", line 395, in optimizer_step
self.optimizer.step()
File "D:\Anaconda\install\envs\hub\lib\site-packages\decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddle\fluid\dygraph\base.py", line 296, in impl
return func(*args, **kwargs)
File "D:\Anaconda\install\envs\hub\lib\site-packages\decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddle\fluid\wrapped_decorator.py", line 25, in impl
return wrapped_func(*args, **kwargs)
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddle\fluid\framework.py", line 229, in impl
return func(*args, **kwargs)
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddle\optimizer\adam.py", line 422, in step
loss=None, startup_program=None, params_grads=params_grads)
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddle\optimizer\optimizer.py", line 891, in _apply_optimize
optimize_ops = self._create_optimization_pass(params_grads)
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddle\optimizer\adamw.py", line 372, in _create_optimization_pass
AdamW, self)._create_optimization_pass(parameters_and_grads)
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddle\optimizer\optimizer.py", line 677, in _create_optimization_pass
[p[0] for p in parameters_and_grads if not p[0].stop_gradient])
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddle\optimizer\adam.py", line 297, in _create_accumulators
self._add_moments_pows(p)
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddle\optimizer\adam.py", line 262, in _add_moments_pows
self._add_accumulator(self._moment1_acc_str, p, dtype=acc_dtype)
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddle\optimizer\optimizer.py", line 593, in _add_accumulator
var.set_value(self._accumulators_holder[var_name])
File "D:\Anaconda\install\envs\hub\lib\site-packages\decorator.py", line 232, in fun
return caller(func, *(extras + args), **kw)
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddle\fluid\wrapped_decorator.py", line 25, in impl
return wrapped_func(*args, **kwargs)
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddle\fluid\framework.py", line 229, in impl
return func(*args, **kwargs)
File "D:\Anaconda\install\envs\hub\lib\site-packages\paddle\fluid\dygraph\varbase_patch_methods.py", line 170, in set_value
self.name, self_tensor_np.shape, value_np.shape)
AssertionError: Variable Shape not match, Variable [ embedding_0.w_0_moment1_0 ] need tensor with shape (18000, 768) but load set tensor with shape (50006, 1024)
[2022-04-06 14:56:09,814] [ INFO] - PaddleHub model checkpoint loaded. current_epoch=9 [acc=0.9400]
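Judging from the error, the parameter shapes do not match: the checkpoint being loaded does not fit the new model. After changing the model you need to retrain.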
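The kind of check that fails here can be sketched in a few lines. This is illustrative only: `find_shape_mismatches` is a hypothetical helper, not a PaddleHub or Paddle API, and the two shape dictionaries are written by hand from the AssertionError above rather than read from a real checkpoint.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(expected: Dict[str, Shape],
                          saved: Dict[str, Shape]) -> List[str]:
    """Return one message per variable whose saved shape differs from the expected one."""
    problems = []
    for name, want in expected.items():
        if name not in saved:
            continue  # nothing saved for this variable, so nothing to compare
        got = saved[name]
        if got != want:
            problems.append(f"{name}: model needs {want}, checkpoint has {got}")
    return problems


# Shapes copied from the AssertionError in this thread (hand-written, not loaded).
expected_shapes = {"embedding_0.w_0_moment1_0": (18000, 768)}
saved_shapes = {"embedding_0.w_0_moment1_0": (50006, 1024)}

mismatches = find_shape_mismatches(expected_shapes, saved_shapes)
for line in mismatches:
    print(line)
```

Any mismatch reported this way means the saved optimizer state belongs to a different model and cannot be resumed from.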
AssertionError: Variable Shape not match, Variable [ embedding_0.w_0_moment1_0 ] need tensor with shape (18000, 768) but load set tensor with shape (50006, 1024)
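I followed your advice. A checkpoint directory generated earlier with the other model was still present, so I created a new checkpoint2 directory under text_classification to save to, and that solved the problem.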
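One way to avoid this collision in the future is to derive the checkpoint directory from the model name and version, so each model only ever resumes from its own checkpoints. A minimal sketch: `checkpoint_dir_for` is a hypothetical helper and the `./checkpoint` base directory is an assumption, not something PaddleHub does by itself.

```python
import os
import re


def checkpoint_dir_for(model_name: str, version: str,
                       base: str = "./checkpoint") -> str:
    """Build a per-model checkpoint path, e.g. <base>/ernie-2.0.2."""
    # Replace characters that are awkward in directory names.
    safe = re.sub(r"[^A-Za-z0-9._-]", "_", f"{model_name}-{version}")
    return os.path.join(base, safe)


# Hypothetical usage: one directory per model/version pair.
ckpt_dir = checkpoint_dir_for("ernie", "2.0.2")
print(ckpt_dir)
```

The resulting path would then be used as the checkpoint directory in train.py, in place of the directory left over from the earlier model.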
In the text matching demo, I switched from ernie-tiny to bert-base-chinese or to ernie version 2.0.2 and retrained, and ran into this problem:
PS D:\PaddleHub\PaddleHub\demo\text_matching> python train.py
[2022-04-07 14:49:49,768] [ INFO] - Already cached C:\Users\admin.paddlenlp\models\bert-base-chinese\bert-base-chinese.pdparams
W0407 14:49:49.770637 18872 device_context.cc:447] Please NOTE: device: 0, GPU Compute Capability: 7.5, Driver API Version: 11.4, Runtime API Version: 10.2
W0407 14:49:49.778635 18872 device_context.cc:465] device: 0, cuDNN Version: 7.6.
[2022-04-07 14:49:54,666] [ INFO] - Weights from pretrained model not used in BertModel: ['cls.predictions.decoder_weight', 'cls.predictions.decoder_bias', 'cls.predictions.transform.weight', 'cls.predictions.transform.bias', 'cls.predictions.layer_norm.weight', 'cls.predictions.layer_norm.bias', 'cls.seq_relationship.weight', 'cls.seq_relationship.bias']
[2022-04-07 14:49:55,127] [ INFO] - Already cached C:\Users\admin.paddlenlp\models\bert-base-chinese\bert-base-chinese-vocab.txt
[2022-04-07 14:50:50,766] [ WARNING] - PaddleHub model checkpoint not found, start from scratch...
The run ends at this point. How can I solve this?
You can refer to this tutorial: "PaddleHub 2.0: Text classification with the dynamic-graph pretrained model ERNIE"