Self-Attention-GAN-Tensorflow
About SN & BN
Hi,
Great work! I'm wondering if applying BN with scale=True makes the network no longer a Lipschitz-1 function (which should be the goal of SN?).
In the SN paper, the entire network is treated as a single function, and its Lipschitz constant is bounded by the product of the Lipschitz constants of its components. Since BN with scale=True introduces an additional learned scaling parameter (gamma), shouldn't that scale also be taken into account when bounding the network's Lipschitz constant? If so, applying BN after SN seems to undo the work done by SN?
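To make the concern concrete, here is a minimal NumPy sketch (not the repo's code; the shapes and gamma values are hypothetical). SN rescales a weight matrix so its spectral norm is 1, but BN's per-channel scale acts as a diagonal linear map, and the composition's spectral norm can exceed 1:

```python
import numpy as np

rng = np.random.default_rng(0)

# A weight matrix, spectrally normalized to be Lipschitz-1.
W = rng.standard_normal((64, 64))
sigma = np.linalg.norm(W, 2)        # spectral norm = largest singular value
W_sn = W / sigma                    # after SN: spectral norm is exactly 1

# At inference, BN with scale=True acts per-channel as
#   x -> gamma * (x - mean) / sqrt(var) + beta,
# i.e. a diagonal linear map. Assume unit variance for simplicity;
# gamma values here are hypothetical learned scales.
gamma = rng.uniform(0.5, 3.0, size=64)
bn_scale = np.diag(gamma)

composed = bn_scale @ W_sn
print(np.linalg.norm(W_sn, 2))      # 1.0 after SN
print(np.linalg.norm(composed, 2))  # can exceed 1 whenever max |gamma| > 1
```

The composed map's spectral norm is bounded by `max |gamma|` times the spectral norm of `W_sn`, so once any gamma grows past 1 during training, the Lipschitz-1 guarantee from SN no longer holds for the composed layer.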