improved-gan
disc_param_avg
What is disc_param_avg used for here? Why should its updates be included in the parameter update? If the gradients have already been obtained via disc_param_updates, why can't we apply them directly to the layer parameters?
Thanks very much for your answer!
It seems that disc_param_avg could be used to calculate the "historical averaging" regularization term in both the discriminator's and the generator's costs. See section 3.3 of https://arxiv.org/pdf/1606.03498.pdf
However, since disc_param_avg appears nowhere in the cost in this code, I don't think the authors have implemented historical averaging here.
Instead, I think disc_param_avg is just a temporally smoothed set of parameters. It's used at test time to give better, more stable results:
test_batch = th.function(inputs=[x_lab,labels], outputs=test_err, givens=disc_avg_givens)
https://github.com/openai/improved-gan/blob/master/mnist_svhn_cifar10/train_cifar_feature_matching.py#L104
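In other words, disc_param_avg looks like an exponential moving average (Polyak averaging) of the discriminator's parameters. A minimal NumPy sketch of that idea (the decay value here is illustrative, not the one used in the repo):

```python
import numpy as np

def ema_update(avg_params, params, decay=0.5):
    """Move the shadow (averaged) copy toward the live parameters:
    avg <- decay * avg + (1 - decay) * param."""
    for a, p in zip(avg_params, params):
        a *= decay
        a += (1.0 - decay) * p

# Toy example: the average converges toward the (fixed) live parameters.
params = [np.array([1.0, 2.0])]
avg_params = [np.zeros(2)]
for _ in range(5):
    ema_update(avg_params, params, decay=0.5)
# After 5 steps, avg = (1 - 0.5**5) * params = [0.96875, 1.9375]
```

At test time, the averaged copy is substituted for the live parameters (the `givens=disc_avg_givens` argument above), which is what smooths out the per-batch noise in the evaluation.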
@christiancosgrove Thanks very much for your explanation. Now I understand its function here. However, I thought it was already being applied during training:
train_batch_disc = th.function(inputs=[x_lab,labels,x_unl,lr], outputs=[loss_lab, loss_unl, train_err], updates=disc_param_updates+disc_avg_updates)
https://github.com/openai/improved-gan/blob/master/mnist_svhn_cifar10/train_cifar_feature_matching.py#L103