neural-template-gen

Saved state size differences

Open owanr opened this issue 4 years ago • 2 comments

I ran the following:

python chsmm.py -data data/labee2e/ -emb_size 300 -hid_size 300 -layers 1 -dropout 0.3 -K 60 -L 4 -log_interval 100 -thresh 9 -lr 0.5 -sep_attn -unif_lenps -emb_drop -mlpinp -onmt_decay -one_rnn -max_pool -gen_from_fi data/labee2e/src_uniq_valid.txt -load models/e2e-60-1-far.pt -tagged_fi segs/seg-e2e-60-1-far.txt -beamsz 5 -ntemplates 100 -gen_wts '1,1' -cuda -min_gen_tokes 0

and got the following error:

Traceback (most recent call last):
  File "chsmm.py", line 974, in <module>
    net.load_state_dict(saved_state, strict=False)
  File "/home/user/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 847, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for HSMM:
	size mismatch for lut.weight: copying a param with shape torch.Size([872, 300]) from checkpoint, the shape in current model is torch.Size([3018, 300]).
	size mismatch for decoder.weight: copying a param with shape torch.Size([781, 600]) from checkpoint, the shape in current model is torch.Size([2954, 600]).
	size mismatch for decoder.bias: copying a param with shape torch.Size([781]) from checkpoint, the shape in current model is torch.Size([2954]).

It looks like the saved_state dictionary comes from the e2e-60-1-far.pt file I downloaded from this GitHub page, so I'm not sure what I'm doing wrong. I'm also using the provided data and segs files.
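To make the mismatch easier to read, the following is a minimal sketch that compares parameter shapes by name. The helper find_shape_mismatches is hypothetical (it is not part of chsmm.py), and the shapes are copied from the traceback above. With PyTorch, the two dictionaries could be built as {k: tuple(v.shape) for k, v in saved_state.items()} and the same expression over net.state_dict().items(). Note that strict=False only relaxes missing and unexpected keys; it does not allow tensors of different sizes to load.

```python
def find_shape_mismatches(saved_shapes, model_shapes):
    """Return {name: (saved_shape, model_shape)} for shared names whose shapes differ.

    Hypothetical helper for illustration; both arguments map parameter
    names to shape tuples.
    """
    mismatches = {}
    for name, saved_shape in saved_shapes.items():
        model_shape = model_shapes.get(name)
        # Only compare parameters present on both sides.
        if model_shape is not None and tuple(saved_shape) != tuple(model_shape):
            mismatches[name] = (tuple(saved_shape), tuple(model_shape))
    return mismatches


# Shapes copied from the traceback above.
saved_shapes = {
    "lut.weight": (872, 300),
    "decoder.weight": (781, 600),
    "decoder.bias": (781,),
}
model_shapes = {
    "lut.weight": (3018, 300),
    "decoder.weight": (2954, 600),
    "decoder.bias": (2954,),
}

mismatches = find_shape_mismatches(saved_shapes, model_shapes)
for name, (saved, current) in sorted(mismatches.items()):
    print(name, "checkpoint:", saved, "current model:", current)
```

In all three parameters only the first dimension differs, while the 300 and 600 dimensions agree. One possible reading, which is an assumption and not confirmed in this thread, is that the vocabulary built at load time is larger than the one used when the checkpoint was trained.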

owanr, Jun 07 '20 04:06

Hi,

Is it possible you're not using Python 2.7 and PyTorch 0.3.1?

swiseman, Jun 09 '20 01:06

Yes! Sorry for not including that earlier. I'm trying to run the program in Python 3.6 and PyTorch 1.5 because I want to combine it with another project that requires Python 3.5 or above. I only added parentheses to the print statements and modified the loops that use iteritems(), and it looked like it was working fine, but I can't figure out what is causing the error copied above. Do you have an idea of what it might be?
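The port described above touched only print statements and iteritems(). As a hedged aside, a few other Python 2 to 3 changes alter results without raising any error, so they survive that kind of minimal port. The sketch below shows three of them under Python 3. Whether any of these occurs in chsmm.py or its data-loading code is not established in this thread; the variable names and sample values are made up for illustration.

```python
# 1. "/" on two ints is true division in Python 3 (it floored in Python 2).
half = 7 / 2          # 3.5 in Python 3
half_floor = 7 // 2   # 3 in both versions

# 2. map() (like filter() and zip()) returns a one-shot iterator, not a list.
lengths = map(len, ["a", "bb", "ccc"])
first_pass = list(lengths)    # consumes the iterator
second_pass = list(lengths)   # empty: nothing is left to iterate

# 3. dict.keys() returns a view, which cannot be indexed like a list.
counts = {"the": 12, "rare": 1}   # made-up word counts
keys = counts.keys()
try:
    keys[0]
    keys_indexable = True
except TypeError:
    keys_indexable = False

print(half, half_floor, first_pass, second_pass, keys_indexable)
```

Code that computes sizes with "/" or iterates twice over a map() or filter() result can therefore behave differently after a port even though it runs without errors.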

Also, this is a separate request, but would you have time to fill in the remaining help descriptions in the parser section of chsmm.py (lines 862+)? I'm having some trouble understanding the roles of the undocumented parameters, since I'm still learning how an HSMM decoder works.

owanr, Jun 09 '20 05:06