XSum
Some issue with generation
Hey,
I was trying to run Topic-ConvS2S's generate.py on the test data only. I preprocessed just the test data, following the guidelines in the XSum_Dataset readme and using the pretrained LDA model you provided there. However, when I run generation I get an error.
Here is the command line I used:
CUDA_VISIBLE_DEVICES=1 python XSum-Topic-ConvS2S/generate.py ./data-topic-convs2s \
    --path ./topic-convs2s-emnlp18/checkpoints-topic-convs2s/checkpoint_best.pt \
    --batch-size 1 \
    --beam 10 \
    --replace-unk \
    --source-lang document \
    --target-lang summary \
    --doctopics doc-topics \
    --encoder-embed-dim 512 > test-output-topic-convs2s-checkpoint-best.pt
And here is the error message:
Traceback (most recent call last):
  File "XSum-Topic-ConvS2S/generate.py", line 166, in <module>
    main(args)
  File "XSum-Topic-ConvS2S/generate.py", line 43, in main
    models, _ = utils.load_ensemble_for_inference(args.path, dataset.src_dict, dataset.dst_dict)
  File "XSum-Topic-ConvS2S/fairseq/utils.py", line 146, in load_ensemble_for_inference
    model.load_state_dict(state['model'])
  File "XSum-Topic-ConvS2S/fairseq/models/fairseq_model.py", line 69, in load_state_dict
    super().load_state_dict(state_dict, strict)
  File "../anaconda3/envs/thisenv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 847, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for FConvModel:
    size mismatch for encoder.embed_tokens.weight: copying a param with shape torch.Size([50004, 512]) from checkpoint, the shape in current model is torch.Size([4, 512]).
    size mismatch for decoder.embed_tokens.weight: copying a param with shape torch.Size([50004, 512]) from checkpoint, the shape in current model is torch.Size([4, 512]).
    size mismatch for decoder.fc3.bias: copying a param with shape torch.Size([50004]) from checkpoint, the shape in current model is torch.Size([4]).
    size mismatch for decoder.fc3.weight_g: copying a param with shape torch.Size([50004, 1]) from checkpoint, the shape in current model is torch.Size([4, 1]).
    size mismatch for decoder.fc3.weight_v: copying a param with shape torch.Size([50004, 256]) from checkpoint, the shape in current model is torch.Size([4, 256]).
Do you know what might be causing this error? Thanks in advance.
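For what it's worth, every mismatch in the traceback is on the first dimension only: 50004 in the checkpoint versus 4 in the model built at generation time. Below is a small sketch that pulls those numbers out of the error text so they can be compared side by side. The helper name `parse_mismatches` and the regular expression are illustrative, not part of fairseq or PyTorch. My reading, which is an assumption I have not confirmed, is that the model is being built from dictionaries containing only a handful of reserved symbols, which would point at the dictionaries in ./data-topic-convs2s rather than at the checkpoint.

```python
import re

# Matches one "size mismatch" line as it appears in the error text above.
# (Illustrative pattern written for this message format only.)
MISMATCH = re.compile(
    r"size mismatch for (?P<name>[\w.]+): copying a param with shape "
    r"torch\.Size\(\[(?P<ckpt>[\d, ]+)\]\) from checkpoint, "
    r"the shape in current model is torch\.Size\(\[(?P<model>[\d, ]+)\]\)"
)


def parse_mismatches(error_text):
    """Return a list of (param_name, checkpoint_shape, model_shape) tuples."""
    rows = []
    for m in MISMATCH.finditer(error_text):
        ckpt = tuple(int(x) for x in m.group("ckpt").split(","))
        model = tuple(int(x) for x in m.group("model").split(","))
        rows.append((m.group("name"), ckpt, model))
    return rows


# Two lines copied from the traceback above.
error_text = (
    "size mismatch for encoder.embed_tokens.weight: copying a param with shape "
    "torch.Size([50004, 512]) from checkpoint, the shape in current model is "
    "torch.Size([4, 512]).\n"
    "size mismatch for decoder.fc3.bias: copying a param with shape "
    "torch.Size([50004]) from checkpoint, the shape in current model is "
    "torch.Size([4]).\n"
)

for name, ckpt, model in parse_mismatches(error_text):
    # The first dimension is the one that differs in every reported parameter.
    print(f"{name}: checkpoint dim0={ckpt[0]}, current model dim0={model[0]}")
```

If that reading is right, the next thing to check would be whether the dictionary files written by preprocessing into ./data-topic-convs2s match the ones the checkpoint was trained with; I have not verified this yet.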