sngan_projection
Batch normalization with SN
@takerum I have a question about a batch normalization layer with spectral normalization (SN).
The following discussion ignores bias parameters, as they do not affect the Lipschitz constant.
Generally speaking, a batch normalization layer can be regarded as a linear transformation with a diagonal matrix W with W_{i, i} = w_i = gamma_i / sqrt(sigma_i ^ 2 + epsilon) > 0, where gamma_i, sigma_i ^ 2 and epsilon are the scaling constant, the running average of the variance of the input x_i, and a small constant, respectively.
Hence, the spectral norm of this diagonal matrix W is its maximum diagonal element, i.e. max(w_i).
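As a quick sanity check of the claim above, here is a minimal NumPy sketch. The helper name `bn_scale` and the example values of gamma and the running variance are made up for illustration and are not taken from the sngan_projection code.

```python
import numpy as np

def bn_scale(gamma, running_var, eps=1e-5):
    """Per-channel scale w_i = gamma_i / sqrt(sigma_i^2 + eps).

    Hypothetical helper for illustration; biases are ignored as in the text.
    """
    return gamma / np.sqrt(running_var + eps)

# Illustrative values only (assumed, not from any trained model).
gamma = np.array([1.0, 0.5, 2.0])
running_var = np.array([4.0, 0.25, 1.0])

w = bn_scale(gamma, running_var)
W = np.diag(w)  # batch normalization viewed as a diagonal linear map

# ord=2 on a 2-D array returns the largest singular value (spectral norm).
spectral_norm = np.linalg.norm(W, ord=2)

# For a diagonal matrix with positive entries this equals max(w_i).
print(spectral_norm, w.max())
```

If some gamma_i were negative, the spectral norm would be max(|w_i|) instead; the text above assumes w_i > 0.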
Naively applying SN to a batch normalization layer gives W' = W / max(w_i) as the spectrally normalized batch normalization matrix, but W' seems to fail to batch-normalize inputs.
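One possible reading of this concern (an assumption on my side, since the thread does not spell it out) is that dividing by max(w_i) rescales every channel by the same global factor, so the output no longer has per-channel standard deviation gamma_i. The sketch below illustrates that reading with synthetic data; it uses the batch variance instead of a running average, and all names and values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic inputs: 3 channels with different scales (assumed values).
true_std = np.array([2.0, 0.5, 1.0])
x = rng.normal(size=(10000, 3)) * true_std

gamma = np.ones(3)  # scaling constants, set to 1 for simplicity
eps = 1e-5
var = x.var(axis=0)  # batch variance stands in for the running average

w = gamma / np.sqrt(var + eps)  # diagonal of W
w_sn = w / w.max()              # diagonal of W' = W / max(w_i)

centered = x - x.mean(axis=0)   # mean subtraction; biases are otherwise ignored
y = centered * w                # ordinary batch normalization
y_sn = centered * w_sn          # "spectrally normalized" batch normalization

# y has per-channel std close to gamma_i; y_sn is shrunk by 1 / max(w_i).
print("std with W :", y.std(axis=0))
print("std with W':", y_sn.std(axis=0))
print("max of w'  :", w_sn.max())
```

Under this reading, W' has spectral norm 1 by construction, but its output scale depends on the channel with the smallest input variance rather than on gamma_i alone.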
What is the most reasonable way to adapt SN to a batch normalization layer in your framework?
> but W' seems to fail to batch-normalize inputs.
Sorry, but I don't understand what you mean. Could you explain why you think W' is not favorable?