
How do _u and _v update?

Open · luhaofang opened this issue 6 years ago · 1 comment

Thanks for your clear implementation. I have a question about the _u and _v update policy. I noticed that in your implementation, _u is updated before the op's inference phase; does _u also need to be updated by backpropagated gradients? Another question: should I apply the gradient computed from w_bar to the original weight directly? I saw you mentioned this point, but updating w_bar seems more reasonable to me, with the next iteration then taking w_bar as the original weight. Am I right?

luhaofang avatar Nov 20 '18 06:11 luhaofang

Q1: In the nondifferentiable spectral normalization layer spectral_normalization_nondiff.py we do not backpropagate the gradients. This is because during inference, we overwrite the current weights.
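To illustrate, here is a minimal sketch of the nondifferentiable scheme (not the repo's exact code): one power-iteration step runs under `torch.no_grad()`, `u` is carried as a persistent buffer rather than a trainable parameter, and the weight is overwritten in place before the forward pass.

```python
import torch


def spectral_norm_nondiff(weight, u, n_power_iterations=1, eps=1e-12):
    """One nondifferentiable spectral-normalization step (illustrative sketch).

    `u` is a persistent buffer carried between forward passes; no gradient
    flows into it because everything here runs outside the autograd graph.
    Returns the updated `u` for the next call.
    """
    w = weight.view(weight.size(0), -1)  # flatten to a 2-D matrix
    with torch.no_grad():
        for _ in range(n_power_iterations):
            # Power iteration: v ~ right singular vector, u ~ left singular vector
            v = torch.nn.functional.normalize(w.t() @ u, dim=0, eps=eps)
            u = torch.nn.functional.normalize(w @ v, dim=0, eps=eps)
        sigma = u @ (w @ v)              # estimate of the largest singular value
        weight.copy_(weight / sigma)     # overwrite the current weights in place
    return u
```

Because the normalization happens under `no_grad` and overwrites the weights, nothing about it appears in the computation graph, which is why no gradient is backpropagated to `u`.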

In the differentiable layer spectral_normalization.py, we modify the computation graph by creating new parameters *_u, *_v, and *_bar for every weight w. The parameter w is replaced with w_bar, which allows gradients to flow to *_u (and to the original weight w) during backpropagation.
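A hypothetical minimal module showing this idea (parameter names mirror the `*_u`, `*_v`, `*_bar` naming above, but this is a sketch, not the repo's code): all three are `nn.Parameter`s, and the effective weight `w = w_bar / sigma` with `sigma = u^T w_bar v` keeps them all in the graph.

```python
import torch


class SpectralNormDiffLinear(torch.nn.Module):
    """Illustrative differentiable spectral normalization for a linear layer.

    weight_bar, weight_u, and weight_v are all trainable parameters, so
    gradients flow to each of them through the normalized weight.
    """

    def __init__(self, out_features, in_features):
        super().__init__()
        self.weight_bar = torch.nn.Parameter(torch.randn(out_features, in_features))
        self.weight_u = torch.nn.Parameter(
            torch.nn.functional.normalize(torch.randn(out_features), dim=0))
        self.weight_v = torch.nn.Parameter(
            torch.nn.functional.normalize(torch.randn(in_features), dim=0))

    def forward(self, x):
        # sigma = u^T W v approximates the largest singular value of weight_bar
        sigma = self.weight_u @ self.weight_bar @ self.weight_v
        w = self.weight_bar / sigma  # differentiable w.r.t. all three parameters
        return x @ w.t()
```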

It's worth noting that in my experiments, I didn't find a major difference between these two implementations.

Q2:

In the differentiable implementation, the gradient with respect to w_bar is used to compute the gradients with respect to w, *_u, and *_v. In the nondifferentiable implementation, the gradient updates w directly, and w is then normalized again during the next inference step.
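The nondifferentiable cycle described above can be sketched as follows (a toy loop with a stand-in loss, using an exact spectral norm via SVD in place of the repo's power iteration):

```python
import torch

torch.manual_seed(0)
w = torch.nn.Parameter(torch.randn(4, 4))
opt = torch.optim.SGD([w], lr=0.1)

for step in range(3):
    # 1. Normalize the weights in place before inference (outside the graph).
    with torch.no_grad():
        sigma = torch.linalg.matrix_norm(w, ord=2)  # exact largest singular value
        w.copy_(w / sigma)
    # 2. Forward/backward: the gradient lands on w itself.
    loss = w.sum() ** 2  # stand-in loss for illustration
    opt.zero_grad()
    loss.backward()
    # 3. The optimizer updates w directly; step 1 renormalizes it next iteration.
    opt.step()
```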

christiancosgrove avatar Nov 23 '18 16:11 christiancosgrove