I have a problem loading the pretrained model:
Using backend: pytorch
2021-05-17 04:28:43,255 INFO : Pytorch 1.8.1+cu101
2021-05-17 04:28:43,256 INFO : [INFO] Create Vocab, vocab path is /content/drive/MyDrive/HeterSumGraph/cache/MultiNews/vocab
2021-05-17 04:28:43,310 INFO : [INFO] max_size of vocab was specified as 50000; we now have 50000 words. Stopping reading.
2021-05-17 04:28:43,310 INFO : [INFO] Finished constructing vocabulary of 50000 total words. Last word added: medicated
2021-05-17 04:28:43,459 INFO : [INFO] Loading external word embedding...
2021-05-17 04:29:32,127 INFO : [INFO] External Word Embedding iov count: 48908, oov count: 1092
2021-05-17 04:29:32,288 INFO : Namespace(atten_dropout_prob=0.1, batch_size=32, bidirectional=True, blocking=False, cache_dir='/content/drive/MyDrive/HeterSumGraph/cache/MultiNews', cuda=True, data_dir='/content/drive/MyDrive/HeterSumGraph/cache/multinews', doc_max_timesteps=50, embed_train=False, embedding_path='/content/drive/MyDrive/HeterSumGraph/glove.42B.300d.txt', feat_embed_size=50, ffn_dropout_prob=0.1, ffn_inner_hidden_size=512, gcn_hidden_size=64, gpu='0', hidden_size=128, limited=False, log_root='/content/drive/MyDrive/HeterSumGraph/log', lstm_hidden_state=64, lstm_layers=2, m=3, model='HSG', n_feature_size=64, n_head=16, n_iter=1, n_layers=1, recurrent_dropout_prob=0.1, save_label=False, save_root='/content/drive/MyDrive/HeterSumGraph/model', sent_max_len=100, test_model='evalmultinews.ckpt', use_orthnormal_init=True, use_pyrouge=True, vocab_size=50000, word_emb_dim=300, word_embedding=True)
2021-05-17 04:29:32,411 INFO : [MODEL] HeterSumGraph
2021-05-17 04:29:32,411 INFO : [INFO] Start reading ExampleSet
2021-05-17 04:29:32,591 INFO : [INFO] Finish reading ExampleSet. Total time is 0.179303, Total size is 5622
2021-05-17 04:29:32,591 INFO : [INFO] Loading filter word File /content/drive/MyDrive/HeterSumGraph/cache/MultiNews/filter_word.txt
2021-05-17 04:29:32,692 INFO : [INFO] Loading word2sent TFIDF file from /content/drive/MyDrive/HeterSumGraph/cache/MultiNews/test.w2s.tfidf.jsonl!
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py:477: UserWarning: This DataLoader will create 32 worker processes in total. Our suggested max number of worker in current system is 4, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
cpuset_checked))
2021-05-17 04:29:36,417 INFO : [INFO] Use cuda
2021-05-17 04:29:36,418 INFO : [INFO] Decoding...
2021-05-17 04:29:36,419 INFO : [INFO] Restoring evalmultinews.ckpt for testing...The path is /content/drive/MyDrive/HeterSumGraph/model/eval/multinews.ckpt
Traceback (most recent call last):
File "/content/drive/MyDrive/HeterSumGraph/evaluation.py", line 239, in
main()
File "/content/drive/MyDrive/HeterSumGraph/evaluation.py", line 236, in main
run_test(model, dataset, loader, hps.test_model, hps)
File "/content/drive/MyDrive/HeterSumGraph/evaluation.py", line 77, in run_test
model = load_test_model(model, model_name, eval_dir, hps.save_root)
File "/content/drive/MyDrive/HeterSumGraph/evaluation.py", line 57, in load_test_model
model.load_state_dict(torch.load(bestmodel_load_path))
File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1224, in load_state_dict
self.class.name, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for HSumGraph:
Missing key(s) in state_dict: "word2sent.layer.heads.8.fc.weight", "word2sent.layer.heads.8.feat_fc.weight", "word2sent.layer.heads.8.attn_fc.weight", "word2sent.layer.heads.9.fc.weight", "word2sent.layer.heads.9.feat_fc.weight", "word2sent.layer.heads.9.attn_fc.weight", "word2sent.layer.heads.10.fc.weight", "word2sent.layer.heads.10.feat_fc.weight", "word2sent.layer.heads.10.attn_fc.weight", "word2sent.layer.heads.11.fc.weight", "word2sent.layer.heads.11.feat_fc.weight", "word2sent.layer.heads.11.attn_fc.weight", "word2sent.layer.heads.12.fc.weight", "word2sent.layer.heads.12.feat_fc.weight", "word2sent.layer.heads.12.attn_fc.weight", "word2sent.layer.heads.13.fc.weight", "word2sent.layer.heads.13.feat_fc.weight", "word2sent.layer.heads.13.attn_fc.weight", "word2sent.layer.heads.14.fc.weight", "word2sent.layer.heads.14.feat_fc.weight", "word2sent.layer.heads.14.attn_fc.weight", "word2sent.layer.heads.15.fc.weight", "word2sent.layer.heads.15.feat_fc.weight", "word2sent.layer.heads.15.attn_fc.weight".
Unexpected key(s) in state_dict: "dn_feature_proj.weight".
size mismatch for cnn_proj.weight: copying a param with shape torch.Size([128, 300]) from checkpoint, the shape in current model is torch.Size([64, 300]).
size mismatch for cnn_proj.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for lstm.weight_ih_l0: copying a param with shape torch.Size([512, 300]) from checkpoint, the shape in current model is torch.Size([256, 300]).
size mismatch for lstm.weight_hh_l0: copying a param with shape torch.Size([512, 128]) from checkpoint, the shape in current model is torch.Size([256, 64]).
size mismatch for lstm.bias_ih_l0: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for lstm.bias_hh_l0: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for lstm.weight_ih_l0_reverse: copying a param with shape torch.Size([512, 300]) from checkpoint, the shape in current model is torch.Size([256, 300]).
size mismatch for lstm.weight_hh_l0_reverse: copying a param with shape torch.Size([512, 128]) from checkpoint, the shape in current model is torch.Size([256, 64]).
size mismatch for lstm.bias_ih_l0_reverse: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for lstm.bias_hh_l0_reverse: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for lstm.weight_ih_l1: copying a param with shape torch.Size([512, 256]) from checkpoint, the shape in current model is torch.Size([256, 128]).
size mismatch for lstm.weight_hh_l1: copying a param with shape torch.Size([512, 128]) from checkpoint, the shape in current model is torch.Size([256, 64]).
size mismatch for lstm.bias_ih_l1: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for lstm.bias_hh_l1: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for lstm.weight_ih_l1_reverse: copying a param with shape torch.Size([512, 256]) from checkpoint, the shape in current model is torch.Size([256, 128]).
size mismatch for lstm.weight_hh_l1_reverse: copying a param with shape torch.Size([512, 128]) from checkpoint, the shape in current model is torch.Size([256, 64]).
size mismatch for lstm.bias_ih_l1_reverse: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for lstm.bias_hh_l1_reverse: copying a param with shape torch.Size([512]) from checkpoint, the shape in current model is torch.Size([256]).
size mismatch for lstm_proj.weight: copying a param with shape torch.Size([128, 256]) from checkpoint, the shape in current model is torch.Size([64, 128]).
size mismatch for lstm_proj.bias: copying a param with shape torch.Size([128]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for n_feature_proj.weight: copying a param with shape torch.Size([64, 256]) from checkpoint, the shape in current model is torch.Size([128, 128]).
size mismatch for word2sent.ffn.w_1.weight: copying a param with shape torch.Size([512, 64, 1]) from checkpoint, the shape in current model is torch.Size([512, 128, 1]).
size mismatch for word2sent.ffn.w_2.weight: copying a param with shape torch.Size([64, 512, 1]) from checkpoint, the shape in current model is torch.Size([128, 512, 1]).
size mismatch for word2sent.ffn.w_2.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for word2sent.ffn.layer_norm.weight: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for word2sent.ffn.layer_norm.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([128]).
size mismatch for sent2word.layer.heads.0.fc.weight: copying a param with shape torch.Size([50, 64]) from checkpoint, the shape in current model is torch.Size([50, 128]).
size mismatch for sent2word.layer.heads.1.fc.weight: copying a param with shape torch.Size([50, 64]) from checkpoint, the shape in current model is torch.Size([50, 128]).
size mismatch for sent2word.layer.heads.2.fc.weight: copying a param with shape torch.Size([50, 64]) from checkpoint, the shape in current model is torch.Size([50, 128]).
size mismatch for sent2word.layer.heads.3.fc.weight: copying a param with shape torch.Size([50, 64]) from checkpoint, the shape in current model is torch.Size([50, 128]).
size mismatch for sent2word.layer.heads.4.fc.weight: copying a param with shape torch.Size([50, 64]) from checkpoint, the shape in current model is torch.Size([50, 128]).
size mismatch for sent2word.layer.heads.5.fc.weight: copying a param with shape torch.Size([50, 64]) from checkpoint, the shape in current model is torch.Size([50, 128]).
Can anyone help me with this problem?
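In case it helps with diagnosis, here is a minimal sketch (assuming the same checkpoint path shown in my log above) for dumping the parameter names and shapes stored in the checkpoint, so they can be compared against the hyperparameters I passed to evaluation.py (n_head, hidden_size, lstm_hidden_state, n_feature_size, ...). I can post its output if that is useful:

```python
import torch

# Path the evaluation script tried to restore (taken from my log above)
ckpt_path = "/content/drive/MyDrive/HeterSumGraph/model/eval/multinews.ckpt"

# Load the raw state_dict on CPU and print every parameter name with its shape,
# so the checkpoint's sizes can be compared with the current model's settings.
state_dict = torch.load(ckpt_path, map_location="cpu")
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))
```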