Bert-Chinese-Text-Classification-Pytorch
RuntimeError: Error(s) in loading state_dict for BertModel
Hello! Recently I have used your PyTorch framework to load many pretrained models and fine-tune them on my own task, all successfully, but none of the ALBERT-family models worked. The error is as follows:
```
$ python run.py --model albert_base_bright
Loading data...
401it [00:04, 96.21it/s]
140it [00:01, 101.19it/s]
135it [00:01, 86.25it/s]
Time usage: 0:00:07
Traceback (most recent call last):
  File "run.py", line 39, in <module>
    model = x.Model(config).to(config.device)
  File "F:\PycharmProjects\Bert-Chinese-Text-Classification-Pytorch-master\models\albert_base_bright.py", line 40, in __init__
    self.bert = BertModel.from_pretrained(config.bert_path,config=model_config)
  File "D:\anaconda3\lib\site-packages\pytorch_transformers\modeling_utils.py", line 594, in from_pretrained
    model.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for BertModel:
	size mismatch for bert.embeddings.word_embeddings.weight: copying a param with shape torch.Size([21128, 128]) from checkpoint, the shape in current model is torch.Size([21128, 768]).
```
The config.json of albert_base_bright is as follows:
```json
{
  "attention_probs_dropout_prob": 0.0,
  "directionality": "bidi",
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.0,
  "hidden_size": 768,
  "embedding_size": 128,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "max_position_embeddings": 512,
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pooler_fc_size": 768,
  "pooler_num_attention_heads": 12,
  "pooler_num_fc_layers": 3,
  "pooler_size_per_head": 128,
  "pooler_type": "first_token_transform",
  "type_vocab_size": 2,
  "vocab_size": 21128,
  "ln_type": "postln"
}
```
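Notably, embedding_size here is 128 while hidden_size is 768, which is exactly the pair of shapes in the error above ([21128, 128] in the checkpoint vs. [21128, 768] expected by BertModel). A quick check makes this visible (just a sketch; the path is whatever config.bert_path points to):

```python
import json
import os

# Hypothetical path to the albert_base_bright directory (substitute your own config.bert_path).
cfg_path = os.path.join('albert_base_bright', 'config.json')
with open(cfg_path, encoding='utf-8') as f:
    cfg = json.load(f)

# ALBERT factorizes the embeddings: the checkpoint's word_embeddings are
# [vocab_size, embedding_size], while BertModel builds them as [vocab_size, hidden_size].
print(cfg['vocab_size'], cfg['embedding_size'], cfg['hidden_size'])  # 21128 128 768
```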
Among the ALBERT-family models, only albert_xxlarge_zh loads successfully; its JSON config file is as follows:
```json
{
  "attention_probs_dropout_prob": 0,
  "hidden_act": "relu",
  "hidden_dropout_prob": 0,
  "embedding_size": 128,
  "hidden_size": 4096,
  "initializer_range": 0.01,
  "intermediate_size": 16384,
  "max_position_embeddings": 512,
  "num_attention_heads": 16,
  "num_hidden_layers": 12,
  "num_hidden_groups": 1,
  "net_structure_type": 0,
  "layers_to_keep": [],
  "gap_size": 0,
  "num_memory_blocks": 0,
  "inner_group_num": 1,
  "down_scale_factor": 1,
  "type_vocab_size": 2,
  "vocab_size": 21128
}
```
I found this issue on GitHub, so I used HuggingFace's pytorch_transformers to load the model:
```python
import os
import torch.nn as nn
from pytorch_transformers import BertModel, BertConfig, BertTokenizer

class Model(nn.Module):
    def __init__(self, config):
        super(Model, self).__init__()
        model_config = BertConfig.from_json_file(os.path.join(config.bert_path, 'config.json'))
        self.bert = BertModel.from_pretrained(config.bert_path, config=model_config)
        for param in self.bert.parameters():
            param.requires_grad = True
        self.fc = nn.Linear(config.hidden_size, config.num_classes)

    def forward(self, x):
        context = x[0]  # input sentence token ids
        mask = x[2]     # mask over padding positions, same size as the sentence,
                        # with 0 at padded positions, e.g. [1, 1, 1, 1, 0, 0]
        _, pooled = self.bert(context, attention_mask=mask, output_all_encoded_layers=False)
        out = self.fc(pooled)
        return out
```
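For reference, a minimal sketch of what the classifier could look like if the checkpoint were first converted to the HuggingFace ALBERT format: newer `transformers` releases ship AlbertModel/AlbertConfig, which keep the factorized 128-dim embeddings, whereas pytorch_transformers has no ALBERT class and its BertModel.forward does not take output_all_encoded_layers. The class name AlbertClassifier and the checkpoint directory are assumptions, not code from this repo:

```python
import os
import torch.nn as nn
# Assumes the `transformers` package (>= 2.x) and a checkpoint already converted
# to the HuggingFace ALBERT format; this is a sketch, not code from this repo.
from transformers import AlbertConfig, AlbertModel

class AlbertClassifier(nn.Module):
    def __init__(self, config):
        super(AlbertClassifier, self).__init__()
        model_config = AlbertConfig.from_json_file(os.path.join(config.bert_path, 'config.json'))
        # AlbertModel keeps the factorized [vocab_size, embedding_size] word embeddings
        # and projects them up to hidden_size, so the [21128, 128] tensor loads cleanly.
        self.albert = AlbertModel.from_pretrained(config.bert_path, config=model_config)
        self.fc = nn.Linear(model_config.hidden_size, config.num_classes)

    def forward(self, x):
        context, mask = x[0], x[2]
        outputs = self.albert(context, attention_mask=mask)
        pooled = outputs[1]  # pooled [CLS] representation
        return self.fc(pooled)
```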
Have you ever run into this error? What could be the cause? Do I need to convert the checkpoint first with the convert_to_pytorch scripts? I hope you can find time to reply. Thanks!
Hi, I ran into the same situation today when trying to run with ALBERT. Changing self.hidden_size in models/albert.py to 312 makes it work with albert-tiny.
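Concretely, that fix is just making the hidden_size declared in the model's Config match the checkpoint's hidden size. A sketch of the change (the values come from the configs above plus the reply; the other field names and values are illustrative, not copied from the repo):

```python
# models/albert.py -- Config sketch of the fix described in the reply above.
class Config(object):
    def __init__(self, dataset):
        self.bert_path = './albert_tiny_pytorch'  # hypothetical checkpoint directory
        self.num_classes = 10                     # hypothetical number of labels
        # Must match the checkpoint: 312 for albert-tiny, 768 for albert_base_bright,
        # 4096 for albert_xxlarge_zh.
        self.hidden_size = 312
```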