DELG
Question about the training parameters
Hello @feymanpriv, thank you very much for your work.
I am reading through your code, and I would like to ask about some key parameters in your work:
As claimed in the DELG paper, only 15M steps (about 25 epochs) are needed on the GLDv2Clean-Train-80percent dataset, while the max epoch in your config file is 100. Did you train for 100 epochs to get your final result?
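For reference, here is a quick back-of-the-envelope sketch of how steps, batch size, and epochs relate; the dataset size below is my assumption (roughly the 80% train split of GLDv2-clean), so adjust it to the actual split:

```python
import math

# Back-of-the-envelope relation between steps, batch size, and epochs.
# The dataset size is an assumption (roughly the 80% train split of GLDv2-clean),
# not a value taken from this repo's config.
dataset_size = 1_260_000
batch_size = 256

steps_per_epoch = math.ceil(dataset_size / batch_size)
print(steps_per_epoch)            # steps per epoch at this batch size
print(25 * steps_per_epoch)       # total steps for 25 epochs
print(100 * steps_per_epoch)      # total steps for 100 epochs
```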
Are there any modifications in your implementation that differ from the original TensorFlow implementation?
Yes, we trained for 100 epochs. In fact, I have tried the TF implementation; it did not work well, the batch size could only be set to a small value, and the training speed was slow. So I guess some details are different from the authors'. We used a cosine learning-rate schedule in the experiment.
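For anyone wondering what the cosine lr mentioned above looks like in practice, a minimal sketch with PyTorch's built-in `CosineAnnealingLR` (the optimizer parameters here are placeholders, not necessarily the repo's exact settings):

```python
import torch

# Placeholder parameters; the real repo config may differ.
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.SGD(params, lr=0.01, momentum=0.9)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)

for epoch in range(100):
    # ... one training epoch would go here ...
    scheduler.step()
    if epoch % 25 == 0:
        print(epoch, scheduler.get_last_lr())   # lr follows a cosine curve toward 0
```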
@feymanpriv, thank you for your reply.
Your code is very clear and easy to read, thanks for your work. I still have some questions:
- About the learning rate and batch size: as the DELG paper recommends, the hyperparameters are 8 Tesla P100 GPUs with --batch_size=256 and --initial_lr=0.01, or 4 Tesla P100 GPUs with --batch_size=128 and --initial_lr=0.005, while your config file uses 8 GPUs with --batch_size=64 and --initial_lr=0.01; --batch_size should be 256 given the implementation in loader.construct_train_loader (see the sketch after this list).
- Can you share more results on the difference between training for 25 epochs and for 100 epochs?
- You said the TF implementation does not work well; did you use the recommended parameters, and what was the result after 25 epochs?
- What do you mean by "the batch size could only be set to a small value"?
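For context on the per-GPU split mentioned in the first point, here is a rough sketch of how pycls-style loaders usually divide the configured batch size across GPUs; the names are my assumptions, not necessarily what `loader.construct_train_loader` does:

```python
# Hypothetical sketch of how a global batch size is typically split across GPUs
# in pycls-style loaders; the names below are assumptions, please check
# loader.construct_train_loader for what the repo actually does.
NUM_GPUS = 8
TRAIN_BATCH_SIZE = 256        # global batch size in the config
BASE_LR = 0.01

per_gpu_batch_size = TRAIN_BATCH_SIZE // NUM_GPUS     # 32 images per GPU
# Linear scaling rule that often accompanies this split:
# keep the lr proportional to the global batch size.
scaled_lr = BASE_LR * (TRAIN_BATCH_SIZE / 256)
print(per_gpu_batch_size, scaled_lr)
```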
@zjcs, thank you for your attention.
1. I directly used batch size 256 and lr 0.1 in my training (sketched below).
2. I haven't tried 25 epochs.
3. When I used TF, I also trained for 100 epochs with the settings the paper mentioned, but the result could not match the model that the Google team released.
4. Also, in TF training a large batch size runs out of memory, while torch does not.

There are still some open questions here; it would be helpful if you check the code and the training.
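Putting the settings from this reply together, a hedged sketch of the training setup (batch size 256, lr 0.1, SGD with cosine decay over 100 epochs); the model and dataset here are placeholders, not the actual repo code:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder data standing in for GLDv2-clean; the real training uses images
# and the repo's own loader.
dataset = TensorDataset(torch.randn(2048, 512), torch.randint(0, 10, (2048,)))
loader = DataLoader(dataset, batch_size=256, shuffle=True)

model = torch.nn.Linear(512, 10)   # placeholder model, not DELG
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)
criterion = torch.nn.CrossEntropyLoss()

for epoch in range(100):
    for feats, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(feats), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()               # cosine decay applied once per epoch
```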
@feymanpriv
Your reply is very helpful, thank you very much; I will try it later.
Have a nice day~