chainer-fast-neuralstyle
Why does FastStyleNet set w = math.sqrt(2) in ResidualBlock?
Thanks for reading my question.
When I checked FastStyleNet, I found that the Convolution2D layers are passed w = math.sqrt(2). The code is below:
    import math

    import chainer
    import chainer.links as L


    class ResidualBlock(chainer.Chain):
        def __init__(self, n_in, n_out, stride=1, ksize=3):
            # w is passed as the sixth positional argument (wscale) of Convolution2D
            w = math.sqrt(2)
            super(ResidualBlock, self).__init__(
                c1=L.Convolution2D(n_in, n_out, ksize, stride, 1, w),
                c2=L.Convolution2D(n_out, n_out, ksize, 1, 1, w),
                b1=L.BatchNormalization(n_out),
                b2=L.BatchNormalization(n_out)
            )
I have checked the source code of Convolution2D, and the parameter w (wscale) is a scale factor.
The problem is that I don't know why it is set to sqrt(2). Could it be 1?
Thanks very much.
http://docs.chainer.org/en/stable/_modules/chainer/links/connection/convolution_2d.html#Convolution2D
wscale is only used for the initializer.
So this w is the scale used for initializing the weights with Gaussian noise. It is used only during initialization; during training and execution of the model it becomes irrelevant. My guess would be that the actual value is more or less empirically chosen as a trade-off between initial noisiness and training time.
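To make the "scale used only at initialization" point concrete, here is a minimal sketch in plain Python. It assumes (this is not taken from Chainer's source) that the initializer draws Gaussian weights with standard deviation wscale * sqrt(1 / fan_in); `init_conv_weights` is a hypothetical helper, not a Chainer function.

```python
import math
import random
import statistics


def init_conv_weights(n_in, n_out, ksize, wscale=1.0, seed=0):
    """Hypothetical stand-in for a scaled Gaussian initializer."""
    rng = random.Random(seed)
    fan_in = n_in * ksize * ksize
    # Assumption: base std is sqrt(1 / fan_in), multiplied by wscale.
    std = wscale * math.sqrt(1.0 / fan_in)
    n_weights = n_out * n_in * ksize * ksize
    return [rng.gauss(0.0, std) for _ in range(n_weights)]


w_plain = init_conv_weights(16, 16, 3, wscale=1.0)
w_scaled = init_conv_weights(16, 16, 3, wscale=math.sqrt(2))

# Same seed, so the only difference between the two sets is the scale factor.
ratio = statistics.pstdev(w_scaled) / statistics.pstdev(w_plain)
print(ratio)  # close to math.sqrt(2)
```

Under that assumption, w = sqrt(2) gives a standard deviation of sqrt(2 / fan_in), which is the scale commonly recommended for layers followed by ReLU (He initialization). That may be the reason for the choice, though the thread does not confirm it. Setting w = 1 only changes the starting weights, not what the trained model can represent.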
If you're willing to wait longer, you could try setting it even lower, so that the NN starts out with a lower response (= more gray) but also with less noise, but then it might take longer for the NN to learn to produce full-amplitude outputs.
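The "lower response" effect can be illustrated with a toy experiment. The sketch below is not the FastStyleNet architecture: it is a hypothetical stack of fully connected ReLU layers without batch normalization, with weights drawn at an assumed standard deviation of wscale * sqrt(1 / width), and it compares the output magnitude for two values of wscale.

```python
import math
import random


def output_mean_square(wscale, width=32, depth=6, seed=0):
    """Mean squared activation after a toy stack of random ReLU layers."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(width)]
    # Assumption: base std is sqrt(1 / fan_in), multiplied by wscale.
    std = wscale * math.sqrt(1.0 / width)
    for _ in range(depth):
        weights = [[rng.gauss(0.0, std) for _ in range(width)]
                   for _ in range(width)]
        # Linear layer followed by ReLU.
        x = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]
    return sum(v * v for v in x) / width


low = output_mean_square(1.0)
high = output_mean_square(math.sqrt(2))
print(low, high, high / low)
```

Because both runs share a seed and ReLU commutes with positive scaling, the ratio comes out at about 2 ** depth (64 here): each layer initialized with the smaller scale shrinks the signal a little more, so the untrained network starts with a weaker output. In the real network, batch normalization would offset part of this effect.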
@fxtentacle Thanks a lot. I understand it now.