Loss could not converge when training from scratch

Open anti-machinee opened this issue 2 years ago • 4 comments

My experiment is launched with the following configuration:

  • GPU: 2
  • Model: IResNet50
  • Pretrained: No
  • Marginal softmax: ArcFace (s=30, m=0.5), do not use partial_fc
  • Batchsize : 64*2=128
  • Warm up: 0
  • Optimizer: AdamW
  • Init learning rate: 1e-1
  • Loss function: cross entropy
  • Dataset: MS1Mv2

But I am facing some weird behavior: the model does not converge and the loss stays at about 16 for many epochs (unlike the training logs in the InsightFace repo, where the loss converges quickly). A sketch of this setup is included below. Thank you for your attention.
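For reference, a minimal sketch of roughly this setup in plain PyTorch (the ArcFace head below and the commented backbone/optimizer lines are illustrative assumptions, not the exact insightface arcface_torch recipe):

```python
# Hypothetical sketch of the reported setup: IResNet50 embeddings fed into an
# ArcFace margin-softmax head (s=30, m=0.5), trained with cross entropy and
# AdamW at lr=1e-1. Names here are illustrative, not the repo's exact API.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ArcFaceHead(nn.Module):
    """ArcFace: add an angular margin m to the target-class angle, then scale
    all cosines by s before softmax/cross-entropy."""
    def __init__(self, embedding_size, num_classes, s=30.0, m=0.5):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(num_classes, embedding_size))
        nn.init.normal_(self.weight, std=0.01)
        self.s, self.m = s, m

    def forward(self, embeddings, labels):
        # Cosine similarity between L2-normalised embeddings and class centres.
        cosine = F.linear(F.normalize(embeddings), F.normalize(self.weight))
        theta = torch.acos(cosine.clamp(-1 + 1e-7, 1 - 1e-7))
        target = torch.cos(theta + self.m)
        one_hot = F.one_hot(labels, num_classes=self.weight.size(0)).float()
        return self.s * (one_hot * target + (1.0 - one_hot) * cosine)

# backbone = iresnet50()  # IResNet50 backbone from the repo (assumed import)
# head = ArcFaceHead(embedding_size=512, num_classes=num_ids)  # identities in MS1Mv2
# optimizer = torch.optim.AdamW(
#     list(backbone.parameters()) + list(head.parameters()),
#     lr=1e-1, weight_decay=5e-4)  # lr=1e-1 as in the report
# loss = F.cross_entropy(head(backbone(images), labels), labels)
```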

anti-machinee avatar May 27 '22 04:05 anti-machinee

The learning rate should be set lower than 1e-3 when using AdamW.
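A hedged sketch of that suggestion in plain PyTorch (the placeholder model, weight decay, and warm-up length are assumptions, not the repo's defaults):

```python
# Illustrative only: AdamW with a much smaller learning rate (1e-3) plus a
# short linear warm-up, instead of the reported lr=1e-1 with no warm-up.
import torch

model = torch.nn.Linear(512, 1000)  # placeholder for backbone + margin head
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=5e-4)

# Linear warm-up over the first 1000 steps, then hold the base lr.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda step: min(1.0, (step + 1) / 1000))
```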

anxiangsir avatar May 27 '22 12:05 anxiangsir

Hi @anti-machinee. My dataset is different, but I am facing a similar issue: my loss is also around 16. Have you solved this issue?

Akshaysharma29 avatar Aug 13 '22 03:08 Akshaysharma29

Hi @anti-machinee, I am facing a similar issue: my loss is also around 16. Have you solved this issue?

You should try decreasing the learning rate to 1e-3 or even 1e-4.

anti-machinee avatar Aug 13 '22 03:08 anti-machinee

I have tried 1e-4, 1e-5, and 1e-6 with Adam, but my loss didn't decrease.

Akshaysharma29 avatar Aug 13 '22 03:08 Akshaysharma29