just-ask
Overfitting in finetuning
Hello, I am trying to use your pretrained model to reproduce the results on MSVD-QA. I am following the hyperparameters given in the paper and using the ckpt_pt_howtovqa69m file to initialize the model. However, I observe overfitting from the early epochs (I obtained 73.97% accuracy on the training set and 41.79% on the validation set). I also tried retraining the model already fine-tuned on MSVD-QA on the same dataset to see what happens, and performance dropped (I obtained 30% after 20 epochs, after which it saturates)! I searched for your loss and accuracy curves but could not find them. Would it be possible to share them here? Did you obtain similar results, and if so, do you know the origin of this problem? Thank you for your response.
Hi, I also observed overfitting during VideoQA finetuning. I kept logs for the model pretrained on HowToVQA69M + WebVidVQA3M, which achieves 47.47% test accuracy on MSVD-QA after finetuning. When training with an initial learning rate of 1e-5, this model reaches its best validation accuracy (45.10%) after the fifth epoch, when its training accuracy is about 60%. In the remaining epochs, the training accuracy keeps increasing while the validation accuracy saturates.
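Since validation accuracy peaks early and then saturates, one common way to handle this is to keep the checkpoint from the best validation epoch and stop once it has not improved for a few epochs. The snippet below is only a minimal sketch of that selection logic, not the code used in this repository: the function name `select_best_epoch`, the `patience` parameter, and the accuracy values in the example are all hypothetical and made up for illustration.

```python
from typing import Optional, Sequence, Tuple


def select_best_epoch(
    val_accuracies: Sequence[float], patience: int = 3
) -> Tuple[int, float, Optional[int]]:
    """Pick the epoch with the best validation accuracy (hypothetical helper).

    Epochs are 1-indexed. Returns (best_epoch, best_val_acc, stop_epoch),
    where stop_epoch is the epoch at which early stopping would trigger,
    or None if patience was never exhausted.
    """
    if not val_accuracies:
        raise ValueError("val_accuracies must not be empty")

    best_epoch, best_acc = 0, float("-inf")
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc > best_acc:
            # New best: in a real training loop, save the checkpoint here.
            best_epoch, best_acc = epoch, acc
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` consecutive epochs: stop.
            return best_epoch, best_acc, epoch
    return best_epoch, best_acc, None


# Made-up validation accuracies (%), NOT taken from any real training log.
illustrative_val_acc = [40.0, 42.5, 44.0, 44.8, 45.1, 45.0, 44.9, 45.0, 44.7]
best_epoch, best_acc, stop_epoch = select_best_epoch(illustrative_val_acc, patience=3)
print(f"best epoch: {best_epoch}, val acc: {best_acc}, stopped at: {stop_epoch}")
```

Selecting on validation accuracy rather than training accuracy means the growing train/validation gap does not affect which checkpoint is evaluated on the test set.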
Thank you for the quick reply. Could you share the log so that I can compare the results?
Here it is: stdout.log