I have a problem loading the pretrained model:
Using backend: pytorch
2021-05-17 04:28:43,255 INFO : Pytorch 1.8.1+cu101
2021-05-17 04:28:43,256 INFO : [INFO] Create Vocab, vocab path is /content/drive/MyDrive/HeterSumGraph/cache/MultiNews/vocab
2021-05-17 04:28:43,310 INFO : [INFO] max_size of vocab was specified as 50000; we now have 50000 words. Stopping reading.
2021-05-17 04:28:43,310 INFO : [INFO] Finished constructing vocabulary of 50000 total words. Last word added: medicated
2021-05-17 04:28:43,459 INFO : [INFO] Loading external word embedding...
2021-05-17 04:29:32,127 INFO : [INFO] External Word Embedding iov count: 48908, oov count: 1092
2021-05-17 04:29:32,288 INFO : Namespace(atten_dropout_prob=0.1, batch_size=32, bidirectional=True, blocking=False, cache_dir='/content/drive/MyDrive/HeterSumGraph/cache/MultiNews', cuda=True, data_dir='/content/drive/MyDrive/HeterSumGraph/cache/multinews', doc_max_timesteps=50, embed_train=False, embedding_path='/content/drive/MyDrive/HeterSumGraph/glove.42B.300d.txt', feat_embed_size=50, ffn_dropout_prob=0.1, ffn_inner_hidden_size=512, gcn_hidden_size=64, gpu='0', hidden_size=128, limited=False, log_root='/content/drive/MyDrive/HeterSumGraph/log', lstm_hidden_state=64, lstm_layers=2, m=3, model='HSG', n_feature_size=64, n_head=16, n_iter=1, n_layers=1, recurrent_dropout_prob=0.1, save_label=False, save_root='/content/drive/MyDrive/HeterSumGraph/model', sent_max_len=100, test_model='evalmultinews.ckpt', use_orthnormal_init=True, use_pyrouge=True, vocab_size=50000, word_emb_dim=300, word_embedding=True)
2021-05-17 04:29:32,411 INFO : [MODEL] HeterSumGraph
2021-05-17 04:29:32,411 INFO : [INFO] Start reading ExampleSet
2021-05-17 04:29:32,591 INFO : [INFO] Finish reading ExampleSet. Total time is 0.179303, Total size is 5622
2021-05-17 04:29:32,591 INFO : [INFO] Loading filter word File /content/drive/MyDrive/HeterSumGraph/cache/MultiNews/filter_word.txt
2021-05-17 04:29:32,692 INFO : [INFO] Loading word2sent TFIDF file from /content/drive/MyDrive/HeterSumGraph/cache/MultiNews/test.w2s.tfidf.jsonl!
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py:477: UserWarning: This DataLoader will create 32 worker processes in total. Our suggested max number of worker in current system is 4, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
cpuset_checked))
2021-05-17 04:29:36,417 INFO : [INFO] Use cuda
2021-05-17 04:29:36,418 INFO : [INFO] Decoding...
2021-05-17 04:29:36,419 INFO : [INFO] Restoring evalmultinews.ckpt for testing...The path is /content/drive/MyDrive/HeterSumGraph/model/eval/multinews.ckpt
Traceback (most recent call last):
File "/content/drive/MyDrive/HeterSumGraph/evaluation.py", line 239, in
main()
File "/content/drive/MyDrive/HeterSumGraph/evaluation.py", line 236, in main
run_test(model, dataset, loader, hps.test_model, hps)
File "/content/drive/MyDrive/HeterSumGraph/evaluation.py", line 77, in run_test
model = load_test_model(model, model_name, eval_dir, hps.save_root)
File "/content/drive/MyDrive/HeterSumGraph/evaluation.py", line 57, in load_test_model
model.load_state_dict(torch.load(bestmodel_load_path))
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1224, in load_state_dict
self.class.name, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for HSumGraph:
Missing key(s) in state_dict: "word2sent.layer.heads.8.fc.weight", "word2sent.layer.heads.8.feat_fc.weight", "word2sent.layer.heads.8.attn_fc.weight", "word2sent.layer.heads.9.fc.weight", "word2sent.layer.heads.9.feat_fc.weight", "word2sent.layer.heads.9.attn_fc.weight", "word2sent.layer.heads.10.fc.weight", "word2sent.layer.heads.10.feat_fc.weight", "word2sent.layer.heads.10.attn_fc.weight", "word2sent.layer.heads.11.fc.weight", "word2sent.layer.heads.11.feat_fc.weight", "word2sent.layer.heads.11.attn_fc.weight", "word2sent.layer.heads.12.fc.weight", "word2sent.layer.heads.12.feat_fc.weight", "word2sent.layer.heads.12.attn_fc.weight", "word2sent.layer.heads.13.fc.weight", "word2sent.layer.heads.13.feat_fc.weight", "word2sent.layer.heads.13.attn_fc.weight", "word2sent.layer.heads.14.fc.weight", "word2sent.layer.heads.14.feat_fc.weight", "word2sent.layer.heads.14.attn_fc.weight", "word2sent.layer.heads.15.fc.weight", "word2sent.layer.heads.15.feat_fc.weight", "word2sent.layer.heads.15.attn_fc.weight".
Unexpected key(s) in state_dict: "dn_feature_proj.weight".
size mismatch for cnn_proj.weight: copying a param with shape torch.Size([128, 300]) from checkpoint, the shape in current model is torch.Size([64, 300]).
size mismatch for cnn_proj.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for lstm.weight_ih_l0: copying a param with shape torch.Size([512, 300]) from checkpoint, the shape in current model is torch.Size([256, 300]).
size mismatch for lstm.weight_hh_l0: copying a param with shape torch.Size([512, 128]) from checkpoint, the shape in current model is torch.Size([256, 64]).
size mismatch for lstm.bias_ih_l0: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for lstm.bias_hh_l0: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for lstm.weight_ih_l0_reverse: copying a param with shape torch.Size([512, 300]) from checkpoint, the shape in current model is torch.Size([256, 300]).
size mismatch for lstm.weight_hh_l0_reverse: copying a param with shape torch.Size([512, 128]) from checkpoint, the shape in current model is torch.Size([256, 64]).
size mismatch for lstm.bias_ih_l0_reverse: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for lstm.bias_hh_l0_reverse: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for lstm.weight_ih_l1: copying a param with shape torch.Size([512, 256]) from checkpoint, the shape in current model is torch.Size([256, 128]).
size mismatch for lstm.weight_hh_l1: copying a param with shape torch.Size([512, 128]) from checkpoint, the shape in current model is torch.Size([256, 64]).
size mismatch for lstm.bias_ih_l1: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for lstm.bias_hh_l1: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for lstm.weight_ih_l1_reverse: copying a param with shape torch.Size([512, 256]) from checkpoint, the shape in current model is torch.Size([256, 128]).
size mismatch for lstm.weight_hh_l1_reverse: copying a param with shape torch.Size([512, 128]) from checkpoint, the shape in current model is torch.Size([256, 64]).
size mismatch for lstm.bias_ih_l1_reverse: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for lstm.bias_hh_l1_reverse: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for lstm_proj.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([64, 128]).
size mismatch for lstm_proj.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for n_feature_proj.weight: copying a param with shape torch.Size([64, 256]) from checkpoint, the shape in current model is torch.Size([128, 128]).
size mismatch for word2sent.ffn.w_1.weight: copying a param with shape torch.Size([512, 64, 1]) from checkpoint, the shape in current model is torch.Size([512, 128, 1]).
size mismatch for word2sent.ffn.w_2.weight: copying a param with shape torch.Size([64, 512, 1]) from checkpoint, the shape in current model is torch.Size([128, 512, 1]).
size mismatch for word2sent.ffn.w_2.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for word2sent.ffn.layer_norm.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for word2sent.ffn.layer_norm.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for sent2word.layer.heads.0.fc.weight: copying a param with shape torch.Size([50, 64]) from checkpoint, the shape in current model is torch.Size([50, 128]).
size mismatch for sent2word.layer.heads.1.fc.weight: copying a param with shape torch.Size([50, 64]) from checkpoint, the shape in current model is torch.Size([50, 128]).
size mismatch for sent2word.layer.heads.2.fc.weight: copying a param with shape torch.Size([50, 64]) from checkpoint, the shape in current model is torch.Size([50, 128]).
size mismatch for sent2word.layer.heads.3.fc.weight: copying a param with shape torch.Size([50, 64]) from checkpoint, the shape in current model is torch.Size([50, 128]).
size mismatch for sent2word.layer.heads.4.fc.weight: copying a param with shape torch.Size([50, 64]) from checkpoint, the shape in current model is torch.Size([50, 128]).
size mismatch for sent2word.layer.heads.5.fc.weight: copying a param with shape torch.Size([50, 64]) from checkpoint, the shape in current model is torch.Size([50, 128]).
Can anyone help me with this problem?
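In case it helps with diagnosis, here is a minimal sketch (assuming the same checkpoint path shown in my log above) for dumping the parameter names and shapes stored in the checkpoint, so they can be compared against the hyperparameters I passed to evaluation.py (n_head, hidden_size, lstm_hidden_state, n_feature_size, ...). I can post its output if that is useful:

```python
import torch

# Path the evaluation script tried to restore (taken from my log above)
ckpt_path = "/content/drive/MyDrive/HeterSumGraph/model/eval/multinews.ckpt"

# Load the raw state_dict on CPU and print every parameter name with its shape,
# so the checkpoint's sizes can be compared with the current model's settings.
state_dict = torch.load(ckpt_path, map_location="cpu")
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))
```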