PaddleNLP
[Bug]: UNIMO's resize_token_embeddings method does not update the decoder's vocab_size, causing an error
Software Environment
- paddlepaddle: 2.5.2
- paddlepaddle-gpu: 2.5.2
- paddlenlp: 2.8.0
Duplicate Issues
- [X] I have searched the existing issues
Error Description
UNIMO's resize_token_embeddings method does not update the decoder's vocab_size, so input_embeddings_size and output_embeddings_size cannot be aligned.
Steps to Reproduce & Code
from paddlenlp.transformers import UNIMOTokenizer, UNIMOLMHeadModel

tokenizer = UNIMOTokenizer.from_pretrained('./unimo-text-1.0-large')
model = UNIMOLMHeadModel.from_pretrained('./unimo-text-1.0-large')
model.resize_token_embeddings(len(tokenizer))
# The two shapes below no longer match after resizing:
print(model.get_input_embeddings().weight.shape, model.lm_head.weight.shape)
The GPT2 model had a similar problem, but it has already been fixed (see link). After modifying unimo/modeling.py in a similar way, I can fix this issue; I will open a PR for it.
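To illustrate the pattern behind the GPT2 fix, here is a minimal pure-Python sketch (no paddle dependency; all class and attribute names are illustrative, not PaddleNLP's actual API): when the input embedding table is resized, the lm_head/decoder table and config.vocab_size must be resized in the same step, otherwise the two shapes diverge as in this bug.

```python
class ToyConfig:
    def __init__(self, vocab_size, hidden_size):
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size


class ToyModel:
    def __init__(self, config):
        self.config = config
        # Input embedding table: vocab_size x hidden_size (zeros as placeholders).
        self.embeddings = [[0.0] * config.hidden_size for _ in range(config.vocab_size)]
        # Output projection (lm_head) must have the same number of rows so that
        # the logits cover the full vocabulary.
        self.lm_head = [[0.0] * config.hidden_size for _ in range(config.vocab_size)]

    def resize_token_embeddings(self, new_size):
        def resize(table):
            # Truncate, or pad with zero rows, to reach new_size rows.
            if new_size <= len(table):
                return table[:new_size]
            extra = [[0.0] * self.config.hidden_size for _ in range(new_size - len(table))]
            return table + extra

        self.embeddings = resize(self.embeddings)
        self.lm_head = resize(self.lm_head)   # the step the UNIMO model was missing
        self.config.vocab_size = new_size     # keep the config consistent as well


model = ToyModel(ToyConfig(vocab_size=4, hidden_size=2))
model.resize_token_embeddings(6)
assert len(model.embeddings) == len(model.lm_head) == model.config.vocab_size == 6
```

In the real fix, the same idea applies: resize_token_embeddings should resize (or re-tie) the output embeddings and update the decoder's vocab_size alongside the input embeddings.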
This issue is stale because it has been open for 60 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.