a-PyTorch-Tutorial-to-Image-Captioning
High loss and low bleu-4 for training
When I train a new model on the Flickr8k and Flickr30k datasets in my environment, I find that the training loss is too high (about 10) and the BLEU-4 is too low (about 2.4e-232) after 20 epochs. It is also very strange that the parameter epochs_since_improvement is 20. I didn't change the train.py code except for some small bugs. How can I improve this? Is anyone having the same problem? Thanks!

What exactly have you changed in the code?
Be wary of erasing lines like
global best_bleu4, epochs_since_improvement, checkpoint, start_epoch, fine_tune_encoder, data_name, word_map
PEP 8 checkers will flag that line as a warning, but it serves a real purpose here.
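For context, here is a minimal sketch (not the tutorial's actual train.py; the values are made up) of why that line matters: without the global statement, the assignments inside main() would rebind local names instead of the module-level ones.

```python
# Minimal sketch, assuming the tutorial's variable names; values are
# hypothetical. Without the `global` statement, the assignments below would
# create local variables (and `+=` would raise UnboundLocalError), so the
# module-level state would never be updated.

best_bleu4 = 0.0
epochs_since_improvement = 0

def main():
    global best_bleu4, epochs_since_improvement  # rebind the module-level names

    recent_bleu4 = 0.25  # hypothetical validation BLEU-4 for this epoch
    if recent_bleu4 > best_bleu4:
        best_bleu4 = recent_bleu4
        epochs_since_improvement = 0
    else:
        epochs_since_improvement += 1

main()
print(best_bleu4, epochs_since_improvement)  # 0.25 0
```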
I just changed the line "scores, _ = pack_padded_sequence(scores, decode_lengths, batch_first=True)" to "scores = pack_padded_sequence(scores, decode_lengths, batch_first=True).data" while debugging. I also changed some data parameters at the beginning of train.py, but I don't think that has much influence. I didn't change the global-variable code. Do you know how to make the loss converge? Should I lower the learning rate?
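For reference, here is a sketch of that .data change with toy shapes (the names scores, targets, decode_lengths follow the tutorial; the shapes and values are made up for illustration):

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

criterion = torch.nn.CrossEntropyLoss()

scores = torch.randn(2, 3, 5, requires_grad=True)  # (batch, max_len, vocab_size)
targets = torch.randint(0, 5, (2, 3))              # (batch, max_len)
decode_lengths = [3, 2]                            # sorted in decreasing order

# Newer PyTorch returns a PackedSequence object, so take .data to get the
# flat (sum(decode_lengths), vocab_size) tensor that CrossEntropyLoss expects.
scores_packed = pack_padded_sequence(scores, decode_lengths, batch_first=True).data
targets_packed = pack_padded_sequence(targets, decode_lengths, batch_first=True).data

loss = criterion(scores_packed, targets_packed)
loss.backward()
```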
Have you tried this fix instead?
Yeah, I just deleted the '_', but the cross-entropy loss must accept two tensor arguments, so I added '.data' to the end of that line.
That's true. They should be the same in the loss by using .data. Curious, is your loss just not decreasing, or is it getting worse?
My train.py runs, but the loss just doesn't decrease.
I changed the line to "scores = pack_padded_sequence(scores, decode_lengths, batch_first=True)[0]", because the newer PyTorch version requires this. After I made this change, I didn't run into your situation.
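As a quick check (a sketch, not from the tutorial), indexing the returned PackedSequence with [0] gives the same tensor as .data, since PackedSequence behaves like a namedtuple whose first field is data:

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

scores = torch.randn(2, 3, 5)
decode_lengths = [3, 2]

packed = pack_padded_sequence(scores, decode_lengths, batch_first=True)
# packed[0] and packed.data refer to the same flat tensor, so the two
# fixes discussed above are interchangeable.
assert torch.equal(packed[0], packed.data)
```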