a-PyTorch-Tutorial-to-Image-Captioning
High loss and low bleu-4 for training
When I train a new model on the Flickr8k and Flickr30k datasets in my environment, I find that the training loss is too high (about 10) and the BLEU-4 is too low (about 2.4e-232) after 20 epochs. It is also strange that the parameter epochs_since_improvement is 20. I didn't change the train.py code except for some small bugs. How can I improve this? Is anyone else having the same problem? Thanks!
What exactly have you changed in the code?
Be wary of erasing things like
global best_bleu4, epochs_since_improvement, checkpoint, start_epoch, fine_tune_encoder, data_name, word_map
PEP 8 linters will flag those as warnings, but here they serve a real purpose.
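For illustration, here is a minimal sketch (hypothetical function name, not the actual train.py) of why removing that global statement would break the training state:

```python
# Module-level training state, as in train.py.
best_bleu4 = 0.0
epochs_since_improvement = 0

def train_one_epoch(recent_bleu4):
    # Without the `global` declaration, the assignments below would create
    # new local variables, and the module-level values (used for checkpointing
    # and early stopping) would silently stay at their initial values.
    global best_bleu4, epochs_since_improvement
    if recent_bleu4 > best_bleu4:
        best_bleu4 = recent_bleu4
        epochs_since_improvement = 0
    else:
        epochs_since_improvement += 1
```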
I just changed the line "scores, _ = pack_padded_sequence(scores, decode_lengths, batch_first=True)" to "scores = pack_padded_sequence(scores, decode_lengths, batch_first=True).data" to get past an error. I also changed some data parameters at the beginning of train.py, but I don't think that would have much influence. I didn't change the global statement. Do you know how to make the loss converge? Should I lower the learning rate?
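For context, a minimal sketch of that part of the training step under the tutorial's assumptions (the helper name packed_loss is illustrative; scores, targets, and decode_lengths follow the tutorial's shapes, with captions sorted by decreasing length as the decoder returns them):

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pack_padded_sequence

criterion = nn.CrossEntropyLoss()

def packed_loss(scores, targets, decode_lengths):
    # scores: (batch, max_decode_len, vocab_size), targets: (batch, max_decode_len),
    # decode_lengths: true caption lengths, sorted in decreasing order.
    # In newer PyTorch, pack_padded_sequence returns a PackedSequence rather
    # than a tuple, so `scores, _ = ...` fails; `.data` gives the flat tensor
    # the cross-entropy loss expects. Targets must be packed the same way.
    scores = pack_padded_sequence(scores, decode_lengths, batch_first=True).data
    targets = pack_padded_sequence(targets, decode_lengths, batch_first=True).data
    return criterion(scores, targets)
```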
Have you tried this fix instead?
Yeah, I just deleted the '_', but the cross-entropy loss must accept two tensor arguments, so I added '.data' to the end of that line.
That's true, they should be the same in the loss when using .data. Curious, is your loss just not decreasing, or is it getting worse?
My train.py runs, but the loss just doesn't decrease.
I changed the line to "scores = pack_padded_sequence(scores, decode_lengths, batch_first=True)[0]", because the newer PyTorch version requires this. After making that change, I didn't run into your problem.
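For reference, a minimal sketch (assuming the PackedSequence return type of recent PyTorch versions) showing that indexing with [0] and accessing .data retrieve the same flat tensor:

```python
import torch
from torch.nn.utils.rnn import pack_padded_sequence

# Toy batch: 2 sequences padded to length 3, with true lengths [3, 2].
scores = torch.randn(2, 3, 5)  # (batch, max_len, vocab_size)
packed = pack_padded_sequence(scores, [3, 2], batch_first=True)

# A PackedSequence behaves like a namedtuple, so packed[0] is packed.data:
# both are the flat (sum_of_lengths, vocab_size) tensor the loss expects.
assert torch.equal(packed[0], packed.data)
```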