TransVG

Question about Training from scratch.

Open toke1220 opened this issue 3 years ago • 3 comments

Hello @djiajunustc, thank you for your excellent work! I am training TransVG on a single GPU (TITAN V, 12 GB). I trained two models at the same time: one is the unmodified source code from this repository, and the other includes some personal improvements. For both models, at 30 epochs the loss fluctuated around 0.65 with no obvious convergence trend, and val_acc was about 69%. At around 40 epochs, both models began to show a significant increase in loss and a decrease in val_acc. I wonder whether the difference between your training with 8 GPUs and my training with one GPU causes this problem. In addition, could you please provide your log file so that I can refer to it for my work? My email is [email protected]. Thank you again for your work!

toke1220 avatar Dec 24 '21 02:12 toke1220

Did you set a smaller learning rate? Since you train the model on one GPU, the batch size is only 1/8 of the original. Thus, the original learning rate is too large for your training setting.
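
As a rough illustration of this suggestion, the sketch below scales a reference learning rate by the ratio of the new total batch size to the original one (the commonly used linear scaling heuristic, with a square-root variant as an alternative). The helper name `scaled_lr`, the reference learning rate, and the per-GPU batch size are all assumptions made for the example; they are not taken from the TransVG code or configs, so substitute the values from your own setup.

```python
def scaled_lr(base_lr: float, base_batch: int, new_batch: int, rule: str = "linear") -> float:
    """Rescale a learning rate when the total batch size changes.

    Hypothetical helper for illustration only; not part of the TransVG repository.
    rule="linear": lr is multiplied by new_batch / base_batch.
    rule="sqrt":   lr is multiplied by sqrt(new_batch / base_batch).
    """
    if base_batch <= 0 or new_batch <= 0:
        raise ValueError("batch sizes must be positive")
    ratio = new_batch / base_batch
    if rule == "linear":
        return base_lr * ratio
    if rule == "sqrt":
        return base_lr * ratio ** 0.5
    raise ValueError(f"unknown rule: {rule!r}")


# Placeholder numbers (assumed, not the repository defaults):
reference_lr = 1e-4   # learning rate used for the multi-GPU run
per_gpu_batch = 8     # samples per GPU
original_gpus = 8     # GPUs in the original setting
my_gpus = 1           # GPUs available now

lr_single_gpu = scaled_lr(
    reference_lr,
    base_batch=original_gpus * per_gpu_batch,
    new_batch=my_gpus * per_gpu_batch,
)
print(f"learning rate to try on one GPU: {lr_single_gpu:g}")
```

This is only a starting point: if the training script defines several learning rates (for example, separate ones for different parts of the model), each would presumably need the same scaling, and the result may still need tuning.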

jianghaojun avatar Mar 25 '22 02:03 jianghaojun

> Did you set a smaller learning rate? Since you train the model on one GPU, the batch size is only 1/8 of the original. Thus, the original learning rate is too large for your training setting.

Thanks for your suggestion. I will try it.

toke1220 avatar May 07 '22 11:05 toke1220

@toke1220 Did lowering the learning rate solve your problem?

preetom-saha-arko avatar Jun 28 '23 07:06 preetom-saha-arko