a-PyTorch-Tutorial-to-Image-Captioning

High loss and low BLEU-4 during training

Open loserlulin9 opened this issue 2 years ago • 7 comments

When I train a new model on the Flickr8k and Flickr30k datasets in my environment, the training loss is too high (about 10) and the BLEU-4 is too low (about 2.4e-232) after 20 epochs. It is also very strange that the "epochs since last improvement" counter is 20. I didn't change the train.py code except to fix some small bugs. How can I improve it? Is anyone having the same problem? Thanks! [image]

loserlulin9, Feb 16 '23 09:02

What exactly have you changed in the code?

Be wary of erasing things like

global best_bleu4, epochs_since_improvement, checkpoint, start_epoch, fine_tune_encoder, data_name, word_map

PEP 8 checks will mark those as warnings, but here they have a good use.

AndreiMoraru123, Feb 16 '23 09:02
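To illustrate the point about `global`, here is a minimal sketch of why deleting the statement can silently break bookkeeping such as `best_bleu4`. The two helper functions and the values are hypothetical and are not taken from the tutorial's train.py; only the variable name `best_bleu4` comes from the line quoted above.

```python
# Module-level state, similar in spirit to what a training script keeps.
best_bleu4 = 0.0


def update_without_global(recent_bleu4):
    # Hypothetical helper: without `global`, this assignment creates a new
    # local variable, so the module-level best_bleu4 is left untouched.
    best_bleu4 = recent_bleu4
    return best_bleu4


def update_with_global(recent_bleu4):
    # Hypothetical helper: `global` makes the assignment rebind the
    # module-level variable.
    global best_bleu4
    best_bleu4 = max(recent_bleu4, best_bleu4)
    return best_bleu4


update_without_global(0.25)
after_local = best_bleu4   # module-level value is unchanged

update_with_global(0.25)
after_global = best_bleu4  # module-level value has been updated
```

If the `global` line is removed, state that is assigned inside a function never reaches the module level, so things like checkpointing on the best score may stop behaving as intended.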

I just changed the code "scores, _ = pack_padded_sequence(scores, decode_lengths, batch_first=True)" to "scores = pack_padded_sequence(scores, decode_lengths, batch_first=True).data" to debug. I also changed some data parameters at the beginning of train.py, but I don't think that would have much influence. I didn't change the global variables code. Do you know how to make the loss converge? Should I lower the learning rate?

loserlulin9, Feb 16 '23 10:02

Have you tried this fix instead?

AndreiMoraru123, Feb 16 '23 10:02

> Have you tried this fix instead?

Yeah, I just deleted the '_', but the cross-entropy loss must accept two tensor parameters, so I added '.data' to the end of that line.

loserlulin9, Feb 16 '23 10:02

That's true. The two should give the same result in the loss when using .data. Curious: is your loss just not decreasing, or is it getting worse?

AndreiMoraru123, Feb 16 '23 11:02

> That's true. They should be the same in the loss by using .data. Curious, is your loss just not decreasing, or is it getting worse?

My train.py runs, but the loss just doesn't decrease.

loserlulin9, Feb 16 '23 11:02

I changed the code to "scores = pack_padded_sequence(scores, decode_lengths, batch_first=True)[0]", because the new version requires it. After making this change, I didn't encounter your situation.

Kevinskt, Feb 29 '24 08:02
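For reference, the two variants discussed in this thread (`.data` and `[0]`) can be compared directly. The sketch below uses made-up toy shapes and random tensors, not the tutorial's real data, and assumes a recent PyTorch release in which `PackedSequence` has more than two fields, which is why the original two-value unpacking `scores, _ = ...` fails. Packing the targets the same way is an assumption made here so that the loss receives two plain tensors.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence

torch.manual_seed(0)

# Toy sizes for illustration only (assumptions, not the tutorial's values).
batch_size, max_len, vocab_size = 3, 5, 11
decode_lengths = [5, 3, 2]  # sorted in decreasing order, as packing expects by default

scores = torch.randn(batch_size, max_len, vocab_size)          # stand-in decoder outputs
targets = torch.randint(0, vocab_size, (batch_size, max_len))  # stand-in word indices

packed_scores = pack_padded_sequence(scores, decode_lengths, batch_first=True)
packed_targets = pack_padded_sequence(targets, decode_lengths, batch_first=True)

# Two ways of pulling out the packed tensor, as mentioned in the thread.
scores_data = packed_scores.data
scores_idx0 = packed_scores[0]
targets_data = packed_targets.data

# Both variants yield the same tensor, with padded positions dropped.
same = torch.equal(scores_data, scores_idx0)

# Cross-entropy now gets two plain tensors: (N, vocab_size) and (N,).
loss = nn.CrossEntropyLoss()(scores_data, targets_data)
```

Since both variants hand the same tensor to the loss, choosing one over the other should not by itself change the loss value.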