Deep-Residual-Network-For-MXNet
Is the bias term in the conv layers supposed to be disabled, as in the original paper?
I saw that in the original model they disable the bias in the conv layers and add a bias in the scale layers. Since MXNet's batch norm layer already has both a scale and a bias term, I am wondering whether it makes any difference if the conv bias is not disabled. A minimal sketch of the pattern I mean is below.
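For reference, here is a minimal sketch of the conv → batch norm pattern in question (layer names and shapes are made up, not the repo's actual code), with the conv bias disabled via `no_bias=True`:

```python
import mxnet as mx

# Hypothetical conv -> BN -> ReLU block: the conv bias is disabled
# because BatchNorm's beta term already provides a learnable bias.
data = mx.sym.Variable('data')
conv = mx.sym.Convolution(data=data, num_filter=64, kernel=(3, 3),
                          stride=(1, 1), pad=(1, 1), no_bias=True,
                          name='conv1')
bn = mx.sym.BatchNorm(data=conv, fix_gamma=False, name='bn1')
relu = mx.sym.Activation(data=bn, act_type='relu', name='relu1')
```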
Thanks for your point of view.
I have no idea what the "scale layers" mentioned by @horserma are. Could you explain, or recommend some material? Thanks!
@LaoAnchor In the original residual network model (the Caffe release), they use a batch normalization layer followed by a separate scale layer. Unlike the standard formulation in the original batch normalization paper, Caffe's norm layer only subtracts the mean and divides by the standard deviation; the learnable scale and bias come from the scale layer afterwards. So the two layers together, batch norm plus scale, do the job of one standard batch norm layer. Hope it helps.
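This also explains why the conv bias can safely be dropped: batch normalization subtracts the per-channel mean, so any constant bias added by the preceding conv is cancelled out. A quick NumPy sketch (illustrative values only, not from the repo):

```python
import numpy as np

def batch_norm(a, gamma=1.0, beta=0.0, eps=1e-5):
    # Normalize over the batch axis, then apply scale and bias,
    # mirroring what a standard batch norm layer computes.
    mean = a.mean(axis=0)
    var = a.var(axis=0)
    return gamma * (a - mean) / np.sqrt(var + eps) + beta

x = np.random.randn(64, 16)  # activations: (batch, channels)
b = 3.0                      # a constant per-channel bias

# Adding a constant bias shifts the mean by the same amount, so the
# normalized outputs are (numerically) identical with or without it:
print(np.allclose(batch_norm(x), batch_norm(x + b)))  # True
```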
@horserma Thanks, I got that!