Loss could not converge when training from scratch

Open anti-machinee opened this issue 2 years ago • 4 comments

My experiment is launched with the following configuration:

  • GPU: 2
  • Model: IResNet50
  • Pretrained: No
  • Marginal softmax: ArcFace (s=30, m=0.5), do not use partial_fc
  • Batchsize : 64*2=128
  • Warm up: 0
  • Optimizer: AdamW
  • Init learning rate: 1e-1
  • Loss function: cross entropy
  • Dataset: MS1Mv2

But I am facing some weird behavior: the model does not converge and the loss stays at about 16 for many epochs (unlike the training logs in the InsightFace repo, where the loss converges quickly). A sketch of this setup is included below. Thank you for your attention.
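For reference, a minimal sketch of roughly this setup in plain PyTorch (the ArcFace head below and the commented backbone/optimizer lines are illustrative assumptions, not the exact insightface arcface_torch recipe):

```python
# Hypothetical sketch of the reported setup: IResNet50 embeddings fed into an
# ArcFace margin-softmax head (s=30, m=0.5), trained with cross entropy and
# AdamW at lr=1e-1. Names here are illustrative, not the repo's exact API.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ArcFaceHead(nn.Module):
    """ArcFace: add an angular margin m to the target-class angle, then scale
    all cosines by s before softmax/cross-entropy."""
    def __init__(self, embedding_size, num_classes, s=30.0, m=0.5):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(num_classes, embedding_size))
        nn.init.normal_(self.weight, std=0.01)
        self.s, self.m = s, m

    def forward(self, embeddings, labels):
        # Cosine similarity between L2-normalised embeddings and class centres.
        cosine = F.linear(F.normalize(embeddings), F.normalize(self.weight))
        theta = torch.acos(cosine.clamp(-1 + 1e-7, 1 - 1e-7))
        target = torch.cos(theta + self.m)
        one_hot = F.one_hot(labels, num_classes=self.weight.size(0)).float()
        return self.s * (one_hot * target + (1.0 - one_hot) * cosine)

# backbone = iresnet50()  # IResNet50 backbone from the repo (assumed import)
# head = ArcFaceHead(embedding_size=512, num_classes=num_ids)  # identities in MS1Mv2
# optimizer = torch.optim.AdamW(
#     list(backbone.parameters()) + list(head.parameters()),
#     lr=1e-1, weight_decay=5e-4)  # lr=1e-1 as in the report
# loss = F.cross_entropy(head(backbone(images), labels), labels)
```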

anti-machinee avatar May 27 '22 04:05 anti-machinee

The learning rate should be set lower than 1e-3 when using AdamW.
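A hedged sketch of that suggestion in plain PyTorch (the placeholder model, weight decay, and warm-up length are assumptions, not the repo's defaults):

```python
# Illustrative only: AdamW with a much smaller learning rate (1e-3) plus a
# short linear warm-up, instead of the reported lr=1e-1 with no warm-up.
import torch

model = torch.nn.Linear(512, 1000)  # placeholder for backbone + margin head
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=5e-4)

# Linear warm-up over the first 1000 steps, then hold the base lr.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda step: min(1.0, (step + 1) / 1000))
```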

anxiangsir avatar May 27 '22 12:05 anxiangsir

Hi @anti-machinee. My dataset is different, but I am facing a similar issue: my loss is also around 16. Have you solved this issue?

Akshaysharma29 avatar Aug 13 '22 03:08 Akshaysharma29

Hi @anti-machinee, I am facing a similar issue: my loss is also around 16. Have you solved this issue?

You should try decreasing the learning rate to 1e-3 or even 1e-4.

anti-machinee avatar Aug 13 '22 03:08 anti-machinee

I have tried 1e-4, 1e-5, and 1e-6 with Adam, but my loss didn't decrease.

Akshaysharma29 avatar Aug 13 '22 03:08 Akshaysharma29