deep-head-pose
rationale behind learning rate
I was able to train your model on my own machine and get robust webcam estimations. What is the rationale behind disabling training (setting the learning rate to 0) for the first conv and BN layers of your ResNet backbone, and giving a 5x learning rate to the three FC layers? Also, would you suggest more epochs for a smaller model? I need to make this work on a peripheral device for work.
Thanks very much. Great work.
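
For reference, here is a minimal sketch of the learning-rate setup I am asking about, written as optimizer parameter groups. The function name `build_param_groups`, the name prefixes (`conv1.`, `bn1.`, `fc`), and the demo parameter names are my own assumptions based on torchvision-style ResNet naming, not necessarily what the repo's training script uses. The function only builds the list of dicts, so it runs without PyTorch.

```python
from typing import Any, Dict, Iterable, List, Tuple

# Assumed name prefixes (torchvision-style ResNet naming); adjust to the real model.
FROZEN_PREFIXES = ("conv1.", "bn1.")  # first conv layer and its batch norm
HEAD_PREFIXES = ("fc",)               # the three fully connected heads


def build_param_groups(
    named_params: Iterable[Tuple[str, Any]],
    base_lr: float,
    head_multiplier: float = 5.0,
) -> List[Dict[str, Any]]:
    """Split parameters into three groups, each with its own learning rate."""
    frozen, backbone, head = [], [], []
    for name, param in named_params:
        if name.startswith(FROZEN_PREFIXES):
            frozen.append(param)      # learning rate 0: never updated
        elif name.startswith(HEAD_PREFIXES):
            head.append(param)        # boosted learning rate
        else:
            backbone.append(param)    # base learning rate
    return [
        {"params": frozen, "lr": 0.0},
        {"params": backbone, "lr": base_lr},
        {"params": head, "lr": base_lr * head_multiplier},
    ]


# Demo with placeholder strings standing in for real parameter tensors.
demo_params = [
    ("conv1.weight", "w_conv1"),
    ("bn1.weight", "w_bn1"),
    ("layer1.0.conv1.weight", "w_layer1"),
    ("fc_yaw.weight", "w_fc_yaw"),
]
groups = build_param_groups(demo_params, base_lr=1e-3)
for group in groups:
    print(group["lr"], group["params"])
```

With PyTorch, the returned list can be passed directly to an optimizer, for example `torch.optim.Adam(build_param_groups(model.named_parameters(), base_lr))`, where `model` and `base_lr` are placeholders for the actual network and learning rate.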