f-AnoGAN-PyTorch

Encoder Tanh

Open huberb opened this issue 4 years ago • 2 comments

Hello,

I believe there is an error in the encoder architecture. The last layer of the encoder uses Tanh, which does not make sense to me: the generator input is torch.randn(64, 128), which can take values below -1 and above 1, yet the tanh activation limits the encoder output to (-1, 1). This might explain why the images generated during izif training aren't that good.
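For illustration, roughly a third of standard-normal samples lie outside (-1, 1), so a tanh-bounded encoder can never reproduce them. A quick check (hypothetical snippet, not from this repo):

```python
import torch

# Sample latents the way the generator is fed: a standard normal,
# which is unbounded.
z = torch.randn(100_000, 128)

# Fraction of components outside the (-1, 1) range that a final
# Tanh restricts the encoder output to.
outside = ((z < -1) | (z > 1)).float().mean().item()
print(f"{outside:.1%} of N(0,1) samples lie outside (-1, 1)")  # ~31.7%
```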

Am I missing something?

huberb · Mar 03 '21, 11:03

Hi, placing a tanh activation in the last layer of the encoder was proposed in the original paper. The authors say it slightly improves overall performance, and the official TensorFlow implementation uses tanh as well.
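For reference, a minimal sketch of what such an encoder tail looks like (the layer sizes here are illustrative, not the repo's actual dimensions):

```python
import torch.nn as nn

# Illustrative only: final encoder block with the tanh the paper
# proposes; sizes are made up, not taken from this repo.
encoder_head = nn.Sequential(
    nn.Linear(1024, 128),  # project features to the latent dimension
    nn.Tanh(),             # bound the predicted latent to (-1, 1)
)
```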

dbbbbm · Mar 03 '21, 12:03

Oh you're right, I've found this in the original paper:

During encoder training, we restrict the encoder to map normal images into the range (−1σ, +1σ) of the standard normal distribution by implementing a tanh activation function on the output layer of the encoder

However, they are still using σ as a scaling factor; in the original implementation, different scaling factors are tested. I was wondering because, when training the ziz model architecture, I can't get the loss close to 0, and I thought this might be the reason.
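So the quoted passage amounts to a σ-scaled tanh rather than a bare one. A hypothetical sketch of such a head (the class name, sizes, and default σ are mine, not the paper's or the repo's):

```python
import torch
import torch.nn as nn

class ScaledTanhEncoderHead(nn.Module):
    """Hypothetical head: tanh rescaled by sigma, matching the quoted
    restriction to (-1·σ, +1·σ); the paper tests several scaling factors."""
    def __init__(self, in_features=1024, latent_dim=128, sigma=1.0):
        super().__init__()
        self.fc = nn.Linear(in_features, latent_dim)
        self.sigma = sigma

    def forward(self, x):
        # Map into (-sigma, +sigma) instead of a hard (-1, 1).
        return self.sigma * torch.tanh(self.fc(x))
```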

We can still close this if you think the current implementation is correct. Thanks for the insight!

huberb · Mar 03 '21, 12:03