mxnet-SSH

Confused about some parameters for training on WIDER FACE

Open hdjsjyl opened this issue 7 years ago • 3 comments

Thanks for your code, but I am confused about some of the parameters used when training on WIDER FACE. In the code:

  1. end_epoch=10000, lr_steps=[55,68,80]: I am not sure whether this is correct for training on WIDER FACE, because it means the last 9920 epochs use a learning rate of 0.000004?
  2. opt = optimizer.SGD(learning_rate=lr, momentum=0.9, wd=0.0005, rescale_grad=1.0/len(ctx), clip_gradient=None): should rescale_grad be 1.0/len(ctx)/batch_size instead?
  3. During training, the parameters of conv1, conv2, conv2, and upsampling are fixed. Is that correct?
  4. How long did you train on the WIDER FACE dataset, and on how many GPUs? What is the dataset? Any advice would be appreciated. Thanks.

hdjsjyl avatar Oct 16 '18 01:10 hdjsjyl
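
The learning-rate arithmetic behind question 1 can be sketched as a plain step schedule. This is only an illustration, not the repository's code: the base rate of 0.004 and the decay factor of 0.1 are assumptions inferred from the 0.000004 figure quoted in the question, and whether the drop takes effect at or just after each boundary epoch depends on the actual scheduler.

```python
def lr_at_epoch(epoch, base_lr=0.004, lr_steps=(55, 68, 80), factor=0.1):
    """Step schedule: multiply base_lr by factor once per boundary already reached.

    base_lr and factor are assumed values for illustration only.
    """
    drops = sum(1 for step in lr_steps if epoch >= step)
    return base_lr * factor ** drops


if __name__ == "__main__":
    # Print the rate before and after each boundary, and at the nominal last epoch.
    for epoch in (0, 54, 55, 68, 80, 9999):
        print(epoch, f"{lr_at_epoch(epoch):.6f}")
```

Under these assumptions the rate stops changing after epoch 80, so every epoch from 80 up to end_epoch would run at the same smallest value.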

I see the parameter 'color_jitter' for data augmentation. It is not used, correct?

hdjsjyl avatar Oct 16 '18 01:10 hdjsjyl

Training will end at epoch 80. Softmax already does per-instance normalization. The fixed parameters are correct. color_jitter is required.

nttstar avatar Oct 16 '18 05:10 nttstar
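
The point about per-instance normalization can be illustrated with a small pure-Python sketch. It assumes, as the reply states, that the loss divides the softmax cross-entropy gradient by the number of instances; the helper names below are hypothetical and are not taken from the repository. If that assumption holds, the total gradient magnitude does not grow with batch size, so an extra 1/batch_size factor in rescale_grad would shrink it a second time.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_ce_grad(batch_logits, labels):
    """Gradient of the mean softmax cross-entropy with respect to every logit.

    Each instance contributes (probs - one_hot) / n, i.e. the loss is already
    normalized by the number of instances n (the assumption being illustrated).
    """
    n = len(batch_logits)
    grads = []
    for logits, label in zip(batch_logits, labels):
        probs = softmax(logits)
        grads.append([(p - (1.0 if k == label else 0.0)) / n
                      for k, p in enumerate(probs)])
    return grads


def l1_norm(grads):
    """Sum of absolute gradient values over the whole batch."""
    return sum(abs(g) for row in grads for g in row)


if __name__ == "__main__":
    sample, label = [1.0, 2.0, 0.5], 0
    for batch_size in (1, 8, 64):
        grads = mean_ce_grad([sample] * batch_size, [label] * batch_size)
        normalized = l1_norm(grads)
        # Dividing again by batch_size makes the magnitude depend on batch size.
        print(batch_size, normalized, normalized / batch_size)
```

In this sketch the first printed magnitude is the same for every batch size, while the second one falls as the batch grows, which is the double normalization the question was asking about.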

@nttstar Thanks for your reply. This information is important to me. Some other questions:

  1. I found that the program runs fast. Even when I run the code on a modest CPU and GPU, GPU utilization is always close to 100%. That is very good; can you explain how it is achieved?
  2. Did you use focal loss to train on WIDER FACE? How were the results? This is excellent work, thank you very much.

hdjsjyl avatar Oct 17 '18 00:10 hdjsjyl