
training strategy

goodgoodstudy92 opened this issue 5 years ago • 1 comment

Hi, thanks for your amazing project. I'm confused about the learning rate setting. I saw in another issue you mentioned "in order to detect the key points accurately, it is necessary to have a large learning rate for the key points". Does that mean the learning rate in the landmark branch should be larger than for classification and bbox regression? But in your code I found the learning rate is the same for the whole model:

def adjust_learning_rate(optimizer, gamma, epoch, step_index, iteration, epoch_size):
    warmup_epoch = -1
    if epoch <= warmup_epoch:
        lr = 1e-6 + (initial_lr - 1e-6) * iteration / (epoch_size * warmup_epoch)
    else:
        lr = initial_lr * (gamma ** step_index)
    for param_group in optimizer.param_groups:
        param_group['lr'] = lr
    return lr
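For what it's worth, PyTorch can express a per-branch learning rate directly through optimizer param groups, so the schedule function above would not have to change the model. A minimal sketch, not the repo's code: `LandmarkHead` is the landmark head module name in Pytorch_Retinaface, but the multiplier value is an assumption.

```python
# Sketch: give the landmark branch its own learning rate via param groups.
# The 'LandmarkHead' name match and the lmk_lr_mult value are assumptions.
import torch

def build_optimizer(model, base_lr=1e-3, lmk_lr_mult=2.0):
    lmk_params, other_params = [], []
    for name, p in model.named_parameters():
        # route landmark-head parameters into their own group
        (lmk_params if 'LandmarkHead' in name else other_params).append(p)
    return torch.optim.SGD([
        {'params': other_params, 'lr': base_lr},
        {'params': lmk_params, 'lr': base_lr * lmk_lr_mult},
    ], momentum=0.9, weight_decay=5e-4)
```

With this, a schedule like `adjust_learning_rate` would have to scale each group relative to its own base lr rather than overwrite every group with a single value.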

And in RetinaFace's MXNet version, I found the author uses a smaller gradient scale in the landmark branch:

if landmark:
    landmark_diff = rpn_landmark_pred_reshape - landmark_target
    landmark_diff = landmark_diff * landmark_weight
    rpn_landmark_loss_ = mx.symbol.smooth_l1(name='%s_rpn_landmark_loss_stride%d_' % (prefix, stride), scalar=3.0, data=landmark_diff)
    rpn_landmark_loss = mx.sym.MakeLoss(name='%s_rpn_landmark_loss_stride%d' % (prefix, stride), data=rpn_landmark_loss_, grad_scale=0.5 * lr_mult / (config.TRAIN.RPN_BATCH_SIZE))
    ret_group.append(rpn_landmark_loss)
    ret_group.append(mx.sym.BlockGrad(landmark_weight))
return ret_group
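The `grad_scale` above is just a constant multiplier on the landmark gradient, so the PyTorch equivalent is weighting the landmark loss term before summing. A sketch with placeholder loss names (not the repo's variables); the `beta = 1/scalar**2` mapping translates MXNet's `smooth_l1` with `scalar=3.0` into PyTorch's `smooth_l1_loss`:

```python
# Sketch: weight the landmark loss term, mirroring grad_scale in MXNet.
# loss_cls / loss_box / pred_lmk / target_lmk are placeholder names.
import torch
import torch.nn.functional as F

def total_loss(loss_cls, loss_box, pred_lmk, target_lmk, lmk_weight=0.5):
    # beta = 1 / scalar**2, so beta=1/9 matches MXNet smooth_l1 scalar=3.0
    loss_lmk = F.smooth_l1_loss(pred_lmk, target_lmk, beta=1.0 / 9.0)
    # lmk_weight plays the role of grad_scale=0.5*lr_mult above
    return loss_cls + loss_box + lmk_weight * loss_lmk
```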

goodgoodstudy92 avatar Jan 02 '20 08:01 goodgoodstudy92

Hello, if I change the batch size, should I also change the lr?

S201961152 avatar Dec 21 '20 01:12 S201961152
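Regarding the batch size question, a common heuristic (the linear scaling rule from Goyal et al., "Accurate, Large Minibatch SGD") is to scale the learning rate proportionally with the batch size. A sketch; the baseline batch size and lr below are assumed values, not taken from this repo's config:

```python
# Sketch of the linear scaling rule: batch size up by factor k -> lr up by k.
def scaled_lr(base_lr, base_batch, new_batch):
    return base_lr * new_batch / base_batch

# assumed baseline: batch_size=32 at lr=1e-3; doubling the batch suggests 2e-3
```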