sngan_projection

Batch normalization with SN

Open aiueogawa opened this issue 6 years ago • 1 comments

@takerum I have a question about a batch normalization layer with spectral normalization.

The following discussion ignores bias parameters, as they do not affect the Lipschitz constant. Generally speaking, a batch normalization layer can be regarded as a linear transformation with a diagonal matrix W, where W_{i, i} = w_i = gamma_i / sqrt(sigma_i ^ 2 + epsilon) > 0. Here gamma_i, sigma_i ^ 2 and epsilon are the scaling constant, the running average of the variance of input x_i, and a small constant, respectively. Hence, the spectral norm of this diagonal matrix W is its maximum diagonal element, i.e. max(w_i).
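As a quick numerical check of the claim above, here is a minimal NumPy sketch. The values of `gamma`, `running_var` and `eps` are made up for illustration and are not taken from the sngan_projection code. It assumes every gamma_i is positive, as stated above.

```python
import numpy as np

# Illustrative values only (assumptions, not from the repository).
rng = np.random.default_rng(0)
gamma = rng.uniform(0.5, 2.0, size=4)        # positive scaling constants gamma_i
running_var = rng.uniform(0.1, 3.0, size=4)  # running variances sigma_i^2
eps = 1e-5

# Diagonal entries w_i = gamma_i / sqrt(sigma_i^2 + eps), biases ignored.
w = gamma / np.sqrt(running_var + eps)
W = np.diag(w)

# ord=2 gives the largest singular value, i.e. the spectral norm.
spectral_norm = np.linalg.norm(W, ord=2)
max_diag = w.max()

print(spectral_norm, max_diag)
```

The two printed values should agree up to floating-point error. If some gamma_i were negative, the spectral norm would be max(|w_i|) instead.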

Naively applying SN to a batch normalization layer gives W' = W / max(w_i) as the spectrally normalized batch normalization matrix, but W' seems to fail to batch-normalize inputs. What is the most reasonable way to apply SN to a batch normalization layer in your framework?
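One possible reading of this concern is that dividing by max(w_i) rescales every output feature, so the outputs no longer have standard deviation gamma_i. The sketch below illustrates that reading only. The input scales in `sigma` are arbitrary, gamma is set to 1, and the batch variance stands in for the running average; all of these are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic zero-mean inputs with different per-feature scales (assumed values).
sigma = np.array([0.5, 1.0, 4.0])
x = rng.normal(0.0, sigma, size=(100_000, 3))

gamma = np.ones(3)   # scaling constants, set to 1 for simplicity
eps = 1e-5
var = x.var(axis=0)  # batch variance, standing in for the running average

# Plain batch normalization (biases and mean subtraction ignored).
w = gamma / np.sqrt(var + eps)
y_bn = x * w

# Naive spectral normalization: W' = W / max(w_i).
w_sn = w / w.max()
y_sn = x * w_sn

std_bn = y_bn.std(axis=0)
std_sn = y_sn.std(axis=0)

print("std after BN:     ", std_bn)
print("std after BN + SN:", std_sn)
```

Under these assumptions, the plain batch-normalized outputs have standard deviation close to gamma_i = 1, while the outputs of W' are all shrunk by the common factor 1 / max(w_i), which is set by the feature with the smallest input variance.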

aiueogawa, Oct 08 '18 14:10

but W' seems to fail to batch-normalize inputs.

Sorry, but I don't understand what you mean. Could you explain why you think W' is unfavorable?

takerum, Oct 12 '18 06:10