
disc_param_avg

TornadoM opened this issue 7 years ago · 2 comments

What is disc_param_avg used for here? Why should its updates be included in the parameter updates? If the gradients are already obtained via disc_param_updates, why can't we apply them directly to the layer parameters?

Thanks very much for your answer!

TornadoM avatar Jun 21 '17 12:06 TornadoM

It seems that disc_param_avg could be used to calculate the "historical averaging" regularization term in both the discriminator's and the generator's costs. See section 3.3 of https://arxiv.org/pdf/1606.03498.pdf

However, since disc_param_avg appears nowhere in the cost in this code, I don't think the authors have implemented historical averaging here.
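For reference, the historical averaging penalty from section 3.3 adds ||θ − (1/t) Σᵢ θ[i]||² to each player's cost, where the sum runs over the parameters seen at past training steps. A minimal NumPy sketch of what that term would look like (illustrative names, not code from this repo):

```python
import numpy as np

def historical_avg_penalty(theta, hist_sum, t):
    # Penalty from Salimans et al. 2016, Sec. 3.3:
    #   || theta - (1/t) * sum of parameters over steps 1..t ||^2
    hist_mean = hist_sum / t
    return np.sum((theta - hist_mean) ** 2)

# The running sum of past parameters is updated once per training step.
theta_history_sum = np.zeros(3)
penalties = []
for t, theta in enumerate([np.array([1.0, 0.0, -1.0]),
                           np.array([1.2, 0.1, -0.9]),
                           np.array([0.8, -0.1, -1.1])], start=1):
    theta_history_sum += theta
    penalties.append(historical_avg_penalty(theta, theta_history_sum, t))

# At t=1 the historical mean equals the current parameters, so the
# penalty is exactly zero; it grows as the parameters drift.
```

If this were implemented in the Theano code, the penalty would be added to loss_lab / loss_unl before the gradients are taken, which is exactly what is missing here.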

Instead, I think disc_param_avg is just a temporally smoothed set of parameters. It's used at test time to give better, more stable results:

test_batch = th.function(inputs=[x_lab,labels], outputs=test_err, givens=disc_avg_givens)

https://github.com/openai/improved-gan/blob/master/mnist_svhn_cifar10/train_cifar_feature_matching.py#L104
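In other words, disc_avg_updates maintains an exponential moving average (Polyak-style smoothing) of the discriminator parameters during training, and only the test function reads from it. A runnable NumPy sketch of the idea (names are illustrative, not taken from the repo):

```python
import numpy as np

def update_param_avg(params, param_avgs, decay=0.99):
    """One EMA step: each running average moves a small fraction
    toward the current parameter value, which is what the
    disc_avg_updates list appears to do every training batch."""
    return [decay * avg + (1.0 - decay) * p
            for p, avg in zip(params, param_avgs)]

# Toy demonstration: the raw "parameters" jitter around a true value,
# while their moving average converges to it and stays stable.
rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0])
avg = [np.zeros(2)]
for step in range(2000):
    noisy = [true_w + 0.5 * rng.standard_normal(2)]
    avg = update_param_avg(noisy, avg, decay=0.99)

# avg[0] is now close to true_w even though any single noisy
# sample can be far from it.
```

Evaluating with the averaged parameters (the disc_avg_givens substitution in the test function) gives lower-variance, more stable test error than using the raw parameters from the last SGD step.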

christiancosgrove avatar Jun 23 '17 16:06 christiancosgrove

@christiancosgrove Thanks very much for your explanation. Now I understand its purpose here. However, it looks like the averaging is also applied during training, since disc_avg_updates is part of the training function's updates:

train_batch_disc = th.function(inputs=[x_lab,labels,x_unl,lr], outputs=[loss_lab, loss_unl, train_err], updates=disc_param_updates+disc_avg_updates)

https://github.com/openai/improved-gan/blob/master/mnist_svhn_cifar10/train_cifar_feature_matching.py#L103

TornadoM avatar Jun 23 '17 20:06 TornadoM