DCGAN BatchNorm weight initialization looks different
Hi there,
I used torch.utils.tensorboard to watch the weights/grads while training the DCGAN example on the MNIST dataset.
In the DCGAN example, a normal distribution is used to initialize the weights of both the Conv and BatchNorm layers. However, I noticed something strange when I visualized them. In the following figure, G/main/1/weight (BatchNorm) does not appear to be initialized from a normal distribution, because its histogram looks very different from that of G/main/0/weight (ConvTranspose2d). The model has been trained for 10 iterations with batch size 64.
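For reference, the initialization I'm referring to is the weights_init function from the example, which as far as I remember looks roughly like this (written from memory, so the exact means/stds may not match the current example code):

```python
import torch.nn as nn

def weights_init(m):
    # applied to every submodule via netG.apply(weights_init) / netD.apply(weights_init)
    classname = m.__class__.__name__
    if classname.find('Conv') != -1:
        nn.init.normal_(m.weight, 0.0, 0.02)
    elif classname.find('BatchNorm') != -1:
        nn.init.normal_(m.weight, 1.0, 0.02)
        nn.init.zeros_(m.bias)
```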
Could someone explain this?
The related tensorboard code is copied from here:
```python
# logging weights and grads; writer is a torch.utils.tensorboard.SummaryWriter
for tag, value in netD.named_parameters():
    tag = 'D/' + tag.replace('.', '/')
    writer.add_histogram(tag, value.data.cpu().numpy(), global_step)
    writer.add_histogram(tag + '/grad', value.grad.data.cpu().numpy(), global_step)

for tag, value in netG.named_parameters():
    tag = 'G/' + tag.replace('.', '/')
    writer.add_histogram(tag, value.data.cpu().numpy(), global_step)
    writer.add_histogram(tag + '/grad', value.grad.data.cpu().numpy(), global_step)
```
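For completeness, the writer and step counter used in the snippet above are set up roughly like this (a minimal sketch; the log directory name is just my own choice):

```python
from torch.utils.tensorboard import SummaryWriter

# hypothetical setup: log_dir is arbitrary
writer = SummaryWriter(log_dir='runs/dcgan_mnist')
global_step = 0  # incremented once per training iteration
```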