GearNet
confusion on epochs
Hi! I am wondering about the number of epochs in the experiments. The paper states that the number of epochs is set to 200 for EC, but in the config it is set to 50. Should I change it to 200 to reproduce the experiment?
Thanks for your help!
Hi, thanks for the question. In the original experiments, we trained all models for 200 epochs. When organizing the code, I found that training for 50 epochs yields performance similar to training for 200 epochs, so it should be fine to train for just 50 epochs.
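For reference, if you still want to match the paper's 200-epoch setting, you can override the value when loading the config. A minimal sketch in Python, assuming the YAML config stores the epoch count under a `train` section with a `num_epoch` key and lives at the path shown (check your actual config file, since both the key path and file path here are assumptions):

```python
import yaml

# Load the downstream task config and override the number of training epochs.
# NOTE: the config path and the "train" -> "num_epoch" key path are assumed;
# verify them against the config file in your checkout of the repository.
with open("config/downstream/EC/gearnet.yaml") as f:
    cfg = yaml.safe_load(f)

cfg["train"]["num_epoch"] = 200  # restore the paper's 200-epoch setting
print(cfg["train"])
```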
Thank you for your reply! I have another question. When I try to reproduce downstream tasks like EC, you mentioned in other issues that the batch size should be set to 8 on a single GPU. However, I found that the fine-tuning process has low GPU utilization. If there is enough GPU memory, can I enlarge the batch size without affecting the performance of GearNet? Thanks for your help!
Actually, I think the downstream performance is very sensitive to the batch size. For the current config, I find `batch_size=8` is the best setting. Typically, if you want to change the `batch_size`, you also need to change the learning rate and number of epochs to get similar performance. You can try other values of `batch_size` to see whether similar performance can be achieved.
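If you do experiment with a larger batch size, a common starting heuristic (a general rule of thumb, not a GearNet-specific recommendation) is the linear scaling rule: scale the learning rate proportionally with the batch size. A minimal sketch, where the helper name and the `base_lr` default are hypothetical placeholders, so substitute the learning rate from your actual config:

```python
def scaled_hyperparams(new_batch_size, base_batch_size=8, base_lr=1e-4):
    """Hypothetical helper: linearly scale the learning rate with batch size.

    The linear scaling rule (Goyal et al., 2017) is a widely used heuristic,
    not a guarantee; you may still need to re-tune the number of epochs, and
    base_lr here is an assumed placeholder, not the repository's value.
    """
    scale = new_batch_size / base_batch_size
    return {"batch_size": new_batch_size, "lr": base_lr * scale}

# e.g., moving from batch_size=8 to 32 would quadruple the learning rate
print(scaled_hyperparams(32))  # {'batch_size': 32, 'lr': 0.0004}
```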