a-PyTorch-Tutorial-to-Image-Captioning

Questions about training

Open song-heng opened this issue 6 years ago • 11 comments

Hi, thanks for your nice work, which helps me a lot. I have a question: I trained this project on two computers, with two 1080 Tis and one 2080 respectively. At beam size 5, I get BLEU-4 scores of 30.01 and 29.14 respectively, which are far from your 33.17. However, when I compute the BLEU-4 score with your uploaded best checkpoint, I get your reported score. I also compared the file size of your best checkpoint with mine, and found that your file is more than 600MB, while the files I trained are only 397MB and 416MB respectively. Can you tell me how to reproduce your training results? Best Wishes

song-heng avatar Mar 30 '19 03:03 song-heng

Hi, after the performance peaked, I fine-tuned the Encoder for two or three epochs, and the model improved. I stopped after this, but I could have fine-tuned it more, I think.

I'm guessing you did not fine-tune the Encoder?

sgrvinod avatar Mar 30 '19 03:03 sgrvinod

You can do this by setting checkpoint to your best checkpoint, and fine_tune_encoder to True, and train for a few more epochs. You should see it improve.
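For reference, a minimal sketch of the settings involved; the variable names are assumed to match this repository's train.py, and the checkpoint path is a placeholder:

```python
# train.py -- resume from your best checkpoint and unfreeze the encoder
checkpoint = './BEST_checkpoint.pth.tar'  # placeholder: path to your best checkpoint so far
fine_tune_encoder = True                  # enable fine-tuning of the encoder CNN
encoder_lr = 1e-4                         # assumed encoder learning rate; keep it small
epochs = 120                              # train a few more epochs, early-stopping on BLEU-4
```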

sgrvinod avatar Mar 30 '19 03:03 sgrvinod

You can do this by setting checkpoint to your best checkpoint, and fine_tune_encoder to True, and train for a few more epochs. You should see it improve.

You were very kind in helping me; thanks for your guidance. Unlike your approach of fine-tuning the Encoder for only two or three epochs, I spent two days fine-tuning the Encoder with early stopping on BLEU-4, and the latest BLEU-4 score is 31.51, which is still below your 33.17. In addition, my best BLEU score with teacher forcing during training is 24.61, which is a little higher than your 24.29. Did I over-fit the model? Best Wishes

song-heng avatar Apr 01 '19 14:04 song-heng

Hello, can I ask how long you trained to get this result?

Wangzhen-kris avatar Apr 03 '19 01:04 Wangzhen-kris

Did you happen to have a chance to read the a-PyTorch-Tutorial-to-Image-Captioning#remarks section? This has been answered there. However, if you're asking for a time estimate, it depends on the compute you have at your disposal.

kmario23 avatar Apr 03 '19 09:04 kmario23

@song-heng Hi, can you tell me how to test the checkpoint uploaded by @sgrvinod? I tested it on the TEST split, but got a 0.00 BLEU score... I have changed the checkpoint and word map file paths in eval.py. Is there anything else that needs to be modified?

Btw, why does the model size differ between ours and the author's? I have the same question...

guantinglin avatar Apr 16 '19 09:04 guantinglin

Hi, thanks for your nice work, which helps me a lot. I have a question: I trained this project on two computers, with two 1080 Tis and one 2080 respectively. At beam size 5, I get BLEU-4 scores of 30.01 and 29.14 respectively, which are far from your 33.17. However, when I compute the BLEU-4 score with your uploaded best checkpoint, I get your reported score. I also compared the file size of your best checkpoint with mine, and found that your file is more than 600MB, while the files I trained are only 397MB and 416MB respectively. Can you tell me how to reproduce your training results? Best Wishes

Hi, I have a question for you. I wanted to use two 1080 Tis to train this model with torch.nn.DataParallel(), but I met an error: "RuntimeError: Gather got an input of invalid size: got [32, 7, 15013], but expected [32, 6, 15013] (gather at /pytorch/torch/csrc/cuda/comm.cpp:239". How do you use two GPUs?
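For context, this gather error typically appears when DataParallel replicas return tensors whose non-batch dimensions differ: the decoder's output length follows the longest caption in each GPU's shard, so one replica returns [32, 7, 15013] and the other [32, 6, 15013]. A minimal sketch of the problem and one possible workaround (padding to a fixed length), with hypothetical names (ToyDecoder, MAX_LEN) not taken from this repository:

```python
import torch
import torch.nn as nn

MAX_LEN = 52          # hypothetical fixed padding length (>= longest caption)
VOCAB_SIZE = 15013    # matches the vocabulary size in the error message

class ToyDecoder(nn.Module):
    """Toy stand-in for a captioning decoder whose output length depends on the batch."""

    def forward(self, feats, cap_lens):
        # Each DataParallel replica decodes up to the longest caption in *its* shard,
        # so the time dimension differs across GPUs and the gather step fails.
        # Padding every replica's output to MAX_LEN keeps the shapes identical.
        dec_len = int(cap_lens.max().item())
        scores = torch.randn(feats.size(0), dec_len, VOCAB_SIZE, device=feats.device)
        padded = torch.zeros(feats.size(0), MAX_LEN, VOCAB_SIZE, device=feats.device)
        padded[:, :dec_len, :] = scores
        return padded  # fixed-size output gathers cleanly across replicas

# Usage (requires >= 2 GPUs):
# model = nn.DataParallel(ToyDecoder()).cuda()
# out = model(torch.randn(64, 2048).cuda(), torch.randint(1, MAX_LEN, (64,)).cuda())
```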

lzy119 avatar Sep 25 '19 01:09 lzy119

I solved the 0.00 BLEU score by pointing to the correct test dataset. I had been using the COCO word map with the Flickr8k test data; pointing to the matching word map solved it.
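For anyone else hitting 0.00, a quick consistency check is sketched below. It assumes, as in this repository, that the saved checkpoint contains the full decoder with an embedding layer; the file names are placeholders, and the key point is that the word map and checkpoint must come from the same preprocessing run:

```python
import json
import torch

# Placeholders -- these must all refer to the SAME dataset preprocessing run.
data_name = 'coco_5_cap_per_img_5_min_word_freq'
word_map_file = './WORDMAP_' + data_name + '.json'
checkpoint_file = './BEST_checkpoint_' + data_name + '.pth.tar'

with open(word_map_file, 'r') as f:
    word_map = json.load(f)

# Load on CPU; run this from the repository root so the decoder class can be unpickled.
ckpt = torch.load(checkpoint_file, map_location='cpu')
decoder = ckpt['decoder']

# The word map used for decoding must match the vocabulary the decoder was trained with.
assert len(word_map) == decoder.embedding.num_embeddings, \
    "Word map vocabulary does not match the checkpoint -- BLEU will be near zero."
```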

adib0073 avatar Apr 21 '21 04:04 adib0073

@sgrvinod Hello, I have a question: why is the test BLEU higher than the validation BLEU?

feixiangqiqi avatar Apr 14 '22 08:04 feixiangqiqi

@sgrvinod Hello, I have a question: why is the test BLEU higher than the validation BLEU?

It depends on several factors, such as whether your model really generalizes better, the beam size, the test set size, etc. To make your results more reliable, you should do several runs and then average the scores instead of relying on a point estimate.
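To illustrate the point-estimate remark, a small sketch of averaging BLEU-4 over several runs (the scores below are placeholders):

```python
# Placeholder BLEU-4 scores from several independent training runs (different seeds).
scores = [30.01, 29.14, 31.51]

mean = sum(scores) / len(scores)
std = (sum((s - mean) ** 2 for s in scores) / (len(scores) - 1)) ** 0.5  # sample std

print(f"BLEU-4: {mean:.2f} +/- {std:.2f} over {len(scores)} runs")
```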

kmario23 avatar Apr 19 '22 12:04 kmario23

@kmario23 Thank you for your answer! I have another question: why can't we fine-tune the Encoder from the beginning? Is the effect the same as training for a few epochs without fine-tuning and then fine-tuning? Maybe it would affect the training of the decoder?

feixiangqiqi avatar Apr 20 '22 06:04 feixiangqiqi