mxnet-SSH
Confused about some parameters for training on WIDER FACE
Thanks for your code, but I am confused about some parameters for training on WIDER FACE. In the code:
- end_epoch=10000, lr_steps=[55,68,80]; I am not sure whether this is correct for training on WIDER FACE, because it means the last 9920 epochs would use an lr of 0.000004 (see the learning-rate sketch after this list)?
- opt = optimizer.SGD(learning_rate=lr, momentum=0.9, wd=0.0005, rescale_grad=1.0/len(ctx), clip_gradient=None); should rescale_grad be 1.0/len(ctx)/batch_size instead?
- During training, the parameters of conv1, conv2, and the upsampling layers are fixed; is that correct (see the parameter-freezing sketch after this list)?
- How long did you train on the WIDER FACE dataset? How many GPUs did you use? What dataset did you use? Any advice would be appreciated. Thanks.
I also see the parameter 'color_jitter' for data augmentation, but it is not used; is that correct?
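For reference, here is a minimal sketch of the schedule implied by lr_steps=[55,68,80], assuming a base lr of 0.004 and a decay factor of 0.1 at each step (both values are assumptions about the training script's defaults, not taken from this thread):

```python
# Minimal sketch: effective learning rate per epoch for lr_steps=[55, 68, 80].
# base_lr=0.004 and factor=0.1 are assumed defaults, not confirmed here.
base_lr = 0.004
lr_steps = [55, 68, 80]
factor = 0.1

def lr_at_epoch(epoch):
    """Learning rate in effect during the given epoch."""
    lr = base_lr
    for step in lr_steps:
        if epoch >= step:
            lr *= factor
    return lr

for e in (0, 55, 68, 80, 9999):
    print('epoch %5d -> lr %.1e' % (e, lr_at_epoch(e)))
# epoch     0 -> lr 4.0e-03
# epoch    55 -> lr 4.0e-04
# epoch    68 -> lr 4.0e-05
# epoch    80 -> lr 4.0e-06
# epoch  9999 -> lr 4.0e-06
```

So under these assumptions, any epoch past 80 would indeed run at lr 0.000004, which is why the effective end of training matters more than end_epoch itself.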
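And here is a minimal sketch of how layers are typically frozen by name prefix in an MXNet Module; the tiny network and the prefixes below are illustrative assumptions, not the actual SSH symbol:

```python
import mxnet as mx

# Illustrative network; the real SSH symbol is much larger.
data = mx.sym.Variable('data')
conv1 = mx.sym.Convolution(data=data, num_filter=8, kernel=(3, 3), name='conv1')
conv2 = mx.sym.Convolution(data=conv1, num_filter=8, kernel=(3, 3), name='conv2')
sym = mx.sym.Convolution(data=conv2, num_filter=8, kernel=(3, 3), name='conv3')

# Assumed freezing mechanism: select parameters by name prefix and pass
# them to Module as fixed_param_names so the optimizer never updates them.
fixed_param_prefix = ['conv1', 'conv2']  # plus upsampling layers in SSH
fixed_param_names = [
    name for name in sym.list_arguments()
    if any(name.startswith(p) for p in fixed_param_prefix)
]

mod = mx.mod.Module(symbol=sym, data_names=('data',), label_names=None,
                    context=[mx.cpu()], fixed_param_names=fixed_param_names)
```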
Training will end at epoch 80. Softmax already does per-instance normalization, so rescale_grad does not need to divide by the batch size. The fixed parameters are correct. color_jitter is required.
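To illustrate the reply above, here is a sketch of how per-instance normalization inside the softmax loss interacts with rescale_grad, assuming the classification loss is an mx.sym.SoftmaxOutput with normalization='valid' (an assumption about the SSH code, not confirmed in this thread):

```python
import mxnet as mx

cls_score = mx.sym.Variable('cls_score')
label = mx.sym.Variable('label')

# With normalization='valid', SoftmaxOutput divides the gradient by the
# number of non-ignored labels, so per-instance scaling is already done
# inside the loss layer itself.
cls_prob = mx.sym.SoftmaxOutput(data=cls_score, label=label,
                                ignore_label=-1, use_ignore=True,
                                normalization='valid', name='cls_prob')

# Hence the optimizer only needs to average gradients across devices;
# dividing by the batch size again would double-normalize.
ctx = [mx.gpu(0), mx.gpu(1)]  # hypothetical device list
opt = mx.optimizer.SGD(learning_rate=0.004, momentum=0.9, wd=0.0005,
                       rescale_grad=1.0 / len(ctx), clip_gradient=None)
```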
@nttstar Thanks for your reply; this information is important to me. Other questions:
- I found that the program runs fast, and even when I run the code on a weak CPU and GPU, GPU utilization stays near 100%. That is very good; can you explain why?
- Did you use focal loss when training on WIDER FACE? How were the results? This is excellent work, thank you very much.