Deep-Residual-Network-For-MXNet

Is the bias term in the conv layers supposed to be disabled, as in the original paper?

Open horserma opened this issue 9 years ago • 4 comments

I saw that in the original model they disable the bias in the conv layers and add a bias in the scale layers. Since in MXNet the batch norm layer has both a scale and a bias term, I am wondering whether it would make a difference if the conv bias were not disabled.

horserma avatar Feb 25 '16 22:02 horserma
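The redundancy behind this question can be sketched numerically. The following is a minimal pure-Python illustration (not code from this repository): because batch normalization subtracts the batch mean, a constant bias added by a preceding conv layer shifts the mean by the same amount and cancels out, which is why the bias can be disabled without loss.

```python
# Sketch: a constant bias added before batch normalization cancels out,
# because the batch mean shifts by the same constant. Hypothetical
# helper, not from the repo.

def batch_norm(xs, eps=1e-5):
    """Normalize a batch: subtract the mean, divide by the std."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [(x - mean) / (var + eps) ** 0.5 for x in xs]

acts = [0.5, -1.2, 3.3, 0.0]      # pre-BN activations (made-up values)
bias = 7.0                         # a conv bias applied to every unit
shifted = [x + bias for x in acts]

# The two normalized outputs agree up to floating-point rounding.
print(batch_norm(acts))
print(batch_norm(shifted))
```

This is only the mean-subtraction argument; in a real network the learned beta of the batch norm layer then plays the role the conv bias would have played.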

Thanks for your point of view.

freesouls avatar Feb 26 '16 05:02 freesouls

I have no idea what the "scale layers" mentioned by @horserma are. Could you explain, or recommend some material? Thanks!

xzqjack avatar Apr 06 '16 12:04 xzqjack

@LaoAnchor In the original residual network paper, they use a batch normalization layer followed by a scale layer. Unlike the original batch normalization paper, their norm layer only subtracts the mean and divides by the standard deviation, whereas a standard batch norm layer also applies a scale and a bias term after the normalization. In the residual network paper, the two layers together, batch norm plus scale, act as one standard batch norm layer. Hope it helps.

horserma avatar Apr 06 '16 12:04 horserma
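The two-layer decomposition horserma describes can be sketched as follows. This is a hypothetical pure-Python illustration (the names `normalize`, `scale_layer`, and the sample values are assumptions, not repository code): a normalize-only step followed by a separate affine scale layer computes exactly what a standard fused batch norm layer with learnable gamma and beta computes.

```python
# Sketch of the Caffe-style decomposition used by the original ResNet
# model: BatchNorm (normalize only) + Scale (affine), versus a standard
# fused batch norm as in MXNet. Hypothetical helpers, not repo code.

def normalize(xs, eps=1e-5):
    """Normalize-only step: subtract the mean, divide by the std."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return [(x - mean) / (var + eps) ** 0.5 for x in xs]

def scale_layer(xs, gamma, beta):
    """Separate affine 'scale layer': gamma * x + beta."""
    return [gamma * x + beta for x in xs]

def fused_batch_norm(xs, gamma, beta, eps=1e-5):
    """Standard batch norm: normalization and affine terms in one op."""
    return scale_layer(normalize(xs, eps), gamma, beta)

acts = [0.5, -1.2, 3.3, 0.0]  # made-up activations
two_step = scale_layer(normalize(acts), gamma=1.5, beta=0.2)
fused = fused_batch_norm(acts, gamma=1.5, beta=0.2)
print(two_step == fused)  # the two forms compute the same result
```

Since MXNet's BatchNorm already includes the gamma and beta terms, the separate scale layer (and the conv bias it carries) is unnecessary there.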

@horserma Thanks, I got it!

xzqjack avatar Apr 13 '16 03:04 xzqjack