pytorch-spectral-normalization-gan
How do `_u` and `_v` update?
Thanks for your clear implementation. I have a question about the `_u` and `_v` update policy. I've noticed that in your implementation, `_u` is updated before the op's inference phase; does `_u` need to be updated by backpropagated gradients? Another question: should the gradient computed from `w_bar` be applied to the original weight directly? I saw that you mentioned this point, but applying the update to `w_bar` seems more reasonable to me, and then taking `w_bar` as the original weight during the next iteration. Am I right?
Q1:
In the nondifferentiable spectral normalization layer (`spectral_normalization_nondiff.py`), we do not backpropagate the gradients through the normalization. This is because during inference, we overwrite the current weights.
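Concretely, the nondifferentiable approach can be sketched as follows. This is a minimal illustration, not the repository's exact code; the wrapper class and `l2_normalize` helper are hypothetical names:

```python
import torch
import torch.nn as nn

def l2_normalize(v, eps=1e-12):
    return v / (v.norm() + eps)

class SpectralNormNonDiff(nn.Module):
    """Wraps a module and normalizes its weight in place each forward pass."""
    def __init__(self, module, power_iterations=1):
        super().__init__()
        self.module = module
        self.power_iterations = power_iterations
        w = module.weight.data
        height = w.size(0)
        width = w.view(height, -1).size(1)
        self.register_buffer("u", l2_normalize(torch.randn(height)))
        self.register_buffer("v", l2_normalize(torch.randn(width)))

    def forward(self, *args):
        w = self.module.weight.data
        w_mat = w.view(w.size(0), -1)
        # Power iteration to estimate the top singular value sigma.
        for _ in range(self.power_iterations):
            self.v.copy_(l2_normalize(torch.mv(w_mat.t(), self.u)))
            self.u.copy_(l2_normalize(torch.mv(w_mat, self.v)))
        sigma = torch.dot(self.u, torch.mv(w_mat, self.v))
        # Overwrite the stored weight in place: the optimizer later updates
        # this normalized weight directly, and the next forward pass
        # re-normalizes it. No gradient flows through sigma.
        self.module.weight.data = w / sigma
        return self.module(*args)
```

Wrapping a layer, e.g. `SpectralNormNonDiff(nn.Linear(128, 64))`, replaces its weight with the normalized value before every call.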
In the differentiable layer (`spectral_normalization.py`), we modify the computation graph by creating new parameters `*_u`, `*_v`, and `*_bar` for every weight `w`. The parameter `w` is replaced with `w_bar`, which allows gradients to flow to `*_u` (and to the original weight `w`) during backpropagation.
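For comparison, here is a minimal sketch of the differentiable variant, assuming a linear layer for simplicity. Note one deviation from the repository: it registers `*_u` and `*_v` as parameters, whereas this sketch keeps `u` as a buffer refreshed purely by power iteration (the approach PyTorch's built-in `torch.nn.utils.spectral_norm` also takes). The key point, gradients flowing through `w_bar = w / sigma` back to `w`, is the same:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def l2_normalize(v, eps=1e-12):
    return v / (v.norm() + eps)

class SNLinear(nn.Module):
    """Linear layer whose effective weight is w_bar = w / sigma(w)."""
    def __init__(self, in_features, out_features, power_iterations=1):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.power_iterations = power_iterations
        self.register_buffer("u", l2_normalize(torch.randn(out_features)))

    def forward(self, x):
        w = self.weight
        # Power iteration runs outside the graph to refresh u and v.
        with torch.no_grad():
            u = self.u
            for _ in range(self.power_iterations):
                v = l2_normalize(torch.mv(w.t(), u))
                u = l2_normalize(torch.mv(w, v))
            self.u.copy_(u)
        # sigma is recomputed *inside* the graph, so the backward pass
        # differentiates w_bar = w / sigma with respect to w.
        sigma = torch.dot(u, torch.mv(w, v))
        w_bar = w / sigma
        return F.linear(x, w_bar, self.bias)
```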
It's worth noting that in my experiments, I didn't find a major difference between these two implementations.
Q2:
In the differentiable implementation, the gradient with respect to `w_bar` is used to compute the gradients with respect to `w`, `_u`, and `_v`. In the nondifferentiable implementation, the gradient updates `w` directly, and `w` is then normalized during the next inference step.
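A tiny autograd check (a hypothetical snippet, not repository code) makes the differentiable path concrete: the loss is computed from `w_bar`, but `.backward()` populates the gradient of the raw weight `w`, which is what the optimizer steps on:

```python
import torch
import torch.nn.functional as F

w = torch.randn(4, 3, requires_grad=True)
u = F.normalize(torch.randn(4), dim=0)
# One step of power iteration on the detached weight.
v = F.normalize(torch.mv(w.detach().t(), u), dim=0)
u = F.normalize(torch.mv(w.detach(), v), dim=0)

sigma = u @ w @ v           # sigma kept inside the autograd graph
w_bar = w / sigma
loss = w_bar.pow(2).sum()
loss.backward()
print(w.grad is not None)   # True: the gradient lands on w, not w_bar
```

So in the differentiable implementation the optimizer never touches `w_bar`; it is re-derived from the updated `w` on every forward pass.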