just-ask

Overfitting in finetuning

Open kawtherOuenniche opened this issue 2 years ago • 3 comments

Hello, I am trying to use your pretrained model to reproduce the results on MSVD-QA. I am using the hyperparameters given in the paper and initializing the model from the ckpt_pt_howtovqa69m checkpoint. However, I observe overfitting from the early epochs: 73.97% accuracy on the training set versus 41.79% on the validation set. I also tried taking the model already fine-tuned on MSVD-QA and training it again on the same dataset, and performance dropped (I obtained 30% after 20 epochs, after which it saturates)! I looked for your loss and accuracy curves but could not find them. Would it be possible to share them here? Did you obtain similar results, and if so, do you know the cause of this problem? Thank you for your response.

kawtherOuenniche avatar May 24 '22 14:05 kawtherOuenniche

Hi, I also observed overfitting during VideoQA finetuning. I kept logs for the model pretrained on HowToVQA69M + WebVidVQA3M, which achieves 47.47% test accuracy on MSVD-QA after finetuning. With an initial learning rate of 1e-5, this model reaches its best validation accuracy (45.10%) after the fifth epoch, at which point its training accuracy is about 60%. In the remaining epochs, the training accuracy keeps increasing while the validation accuracy saturates.
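
For readers who want to act on this pattern, one common option is to keep the checkpoint from the epoch with the best validation accuracy and stop once it has not improved for a few epochs. The snippet below is only a minimal sketch of that idea, not code from the just-ask repository: the function name `best_epoch_with_patience`, the `patience` parameter, and the example accuracy values are all hypothetical and chosen purely for illustration.

```python
from typing import Optional, Sequence


def best_epoch_with_patience(
    val_acc: Sequence[float], patience: int = 3
) -> Optional[int]:
    """Return the 1-based epoch whose checkpoint early stopping would keep.

    Hypothetical helper: scans per-epoch validation accuracies in order and
    stops once `patience` epochs have passed without a new best value.
    """
    best_epoch: Optional[int] = None
    best_acc = float("-inf")
    for epoch, acc in enumerate(val_acc, start=1):
        if acc > best_acc:
            # New best validation accuracy: remember this epoch.
            best_acc, best_epoch = acc, epoch
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop scanning.
            break
    return best_epoch


# Made-up validation accuracies (NOT taken from the shared log), shaped like
# the behaviour described above: a peak at epoch 5, then a plateau.
illustrative_val_acc = [40.0, 42.0, 43.5, 44.0, 45.1, 45.0, 44.9, 45.0, 44.8]
print(best_epoch_with_patience(illustrative_val_acc, patience=3))  # prints 5
```

In a real training loop, the same comparison would decide when to save the model weights, so that the reported test accuracy comes from the best-validation checkpoint and not from the last epoch.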

antoyang avatar May 24 '22 14:05 antoyang

Thank you for the quick reply. Could you share the log so that I can compare the results?

kawtherOuenniche avatar May 24 '22 15:05 kawtherOuenniche

There it is! stdout.log

antoyang avatar May 24 '22 20:05 antoyang