MobileNet-Caffe

The regularization of depthwise convolution

Open BOBrown opened this issue 6 years ago • 1 comments

The authors wrote the following in the paper: "Additionally, we found that it was important to put very little or no weight decay (l2 regularization) on the depthwise filters since their are so few parameters in them."

Therefore, I think we should set decay_mult: 0.0 for the depthwise convolution layers in the MobileNet prototxt.
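As a rough sketch of the change, assuming the depthwise layers are written as grouped Convolution layers (group equal to num_output), it could look like the fragment below. The layer name, blob names, and channel count are illustrative and may not match the prototxt in this repository.

```prototxt
# Hypothetical depthwise layer; names and sizes are placeholders.
layer {
  name: "conv2_1/dw"
  type: "Convolution"
  bottom: "conv1"
  top: "conv2_1/dw"
  param {
    lr_mult: 1
    decay_mult: 0   # no weight decay on the depthwise filters
  }
  convolution_param {
    num_output: 32
    group: 32       # group == num_output makes the convolution depthwise
    bias_term: false
    kernel_size: 3
    pad: 1
    stride: 1
    weight_filler { type: "msra" }
  }
}
```

The pointwise (1x1) layers would keep decay_mult: 1, so the solver's weight_decay still applies to them.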

BOBrown avatar Mar 04 '18 09:03 BOBrown

Isn't this line taken from the MobileNetV1 paper? I couldn't find any such statement in the MobileNetV2 paper.

I wonder whether all parameters are decayed in MobileNetV2 training. At least, that is my understanding from looking at the (very few) repositories that provide a training script, e.g. https://github.com/Randl/MobileNetV2-pytorch

mathmanu avatar Jun 05 '18 07:06 mathmanu