trojans-face-recognizer Wrong Implementation of the Stochastic Depth ?

Open Bear-kai opened this issue 5 years ago • 1 comments

The authors try to achieve Stochastic Depth by simply setting `item.weight.requires_grad` to True/False. However, there are two issues:

If a block is dropped during an iteration, all of its params, not only the conv weights, should be fixed.
Simply setting `requires_grad=False` does not actually fix the params, since the params and their momentum still exist in the optimizer (`param_groups` and `state`). That is to say, even if the grad is zero, the params will still be updated due to the nonzero momentum.

These problems should be solved through the optimizer (unless the authors intended this behavior).
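
The second point can be illustrated with a small plain-Python sketch of the usual SGD-with-momentum rule (`v = momentum * v + grad`, then `p = p - lr * v`). This is not the repository's code or PyTorch's optimizer; `sgd_momentum_step` and the numbers are hypothetical, and weight decay is assumed to be off. It only shows that a zero gradient still moves a parameter while its velocity is nonzero.

```python
def sgd_momentum_step(param, grad, velocity, lr=0.1, momentum=0.9):
    """One SGD-with-momentum step on a scalar parameter (hypothetical helper)."""
    velocity = momentum * velocity + grad
    param = param - lr * velocity
    return param, velocity


# Step 1: the block is active and receives a real gradient.
p, v = sgd_momentum_step(param=1.0, grad=0.5, velocity=0.0)

# Step 2: the block is "dropped", so its gradient is zero,
# but the velocity left over from step 1 still changes the parameter.
p_dropped, v_dropped = sgd_momentum_step(param=p, grad=0.0, velocity=v)

# If the optimizer update were skipped for the dropped block,
# the parameter would stay exactly where it was.
p_skipped = p

print("after active step: ", p)
print("dropped, zero grad:", p_dropped)
print("dropped, skipped:  ", p_skipped)
```

In this sketch, the parameter only stays fixed when the update itself is skipped, which is why the fix is suggested on the optimizer side.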

Oct 18 '19 03:10 Bear-kai

Thanks for your feedback. We ran experiments with two implementations: fixing all parameters in the dropped path, and masking the path by multiplying it by 0. We tested the resulting models on our benchmarks and found no obvious differences between them.
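
For readers unfamiliar with the two variants, here is one possible reading of them as a plain-Python sketch on scalars. The reply does not show the actual code, so every name here (`residual_branch`, `block_masked`, `block_bypassed`) is hypothetical, and the branch is reduced to a single multiplication for illustration.

```python
def residual_branch(x, weight):
    """Stand-in for the block's residual function f(x); here just weight * x."""
    return weight * x


def block_masked(x, weight, keep):
    """Always compute the branch, then multiply its output by 0 when dropped."""
    mask = 1.0 if keep else 0.0
    return x + mask * residual_branch(x, weight)


def block_bypassed(x, weight, keep):
    """Skip the branch entirely when dropped, leaving its parameters untouched."""
    if not keep:
        return x
    return x + residual_branch(x, weight)


def grad_wrt_weight_masked(x, keep):
    """d(output)/d(weight) for block_masked: mask * x, so 0 when dropped."""
    return (1.0 if keep else 0.0) * x


x, w = 2.0, 3.0
for keep in (True, False):
    out_masked = block_masked(x, w, keep)
    out_bypassed = block_bypassed(x, w, keep)
    print(keep, out_masked, out_bypassed, grad_wrt_weight_masked(x, keep))
```

Under these assumptions the two variants produce the same forward output. In the masked variant a dropped block receives a zero gradient rather than no gradient, so whether its parameters stay fixed still depends on how the optimizer handles them.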

Nov 06 '19 07:11 jihaonew
