
Could this model be called a real GAN? The discriminator might contribute nothing to the performance.

Open ANYMS-A opened this issue 6 years ago • 1 comments

Hi there, I have recently been trying to reproduce this SEGAN model and have run into some questions.

1. The biggest question is about the discriminator's loss function. As we know, the discriminator in the original GAN performs binary classification, so it uses a sigmoid at the output layer and binary cross-entropy as the loss. This model's discriminator seems to perform regression instead: its loss minimizes the distance between its outputs and 1 (or 0). So I suspect the discriminator contributes nothing to the final performance, and that minimizing the L1 loss between the clean speech and the generated speech is what makes the whole system work.

2. So I discarded the discriminator and trained only the generator for speech enhancement. It gives performance very close to SEGAN's. Trained on its own, the generator can be seen as a denoising autoencoder.

3. I'm confused about how much the discriminator contributes to the final performance during the adversarial process, because in speech enhancement we are not really 'generating' but 'mapping' a noisy signal to a clean one.
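To make the comparison in the questions above concrete, here is a minimal NumPy sketch of the two discriminator objectives and of a generator objective with an L1 term. It is not taken from the SEGAN code: the function names, the `use_adversarial` switch, and the default `l1_weight=100.0` are assumptions chosen for illustration, and the "distance to 1 (or 0)" objective is written in its least-squares form.

```python
import numpy as np

def bce_discriminator_loss(real_logits, fake_logits):
    """Original-GAN style: sigmoid output + binary cross-entropy (real -> 1, fake -> 0)."""
    # -log(sigmoid(x)) = logaddexp(0, -x) and -log(1 - sigmoid(x)) = logaddexp(0, x),
    # which avoids overflow for large |x|.
    real_term = np.logaddexp(0.0, -np.asarray(real_logits, dtype=float))
    fake_term = np.logaddexp(0.0, np.asarray(fake_logits, dtype=float))
    return float(np.mean(real_term) + np.mean(fake_term))

def ls_discriminator_loss(real_scores, fake_scores):
    """Least-squares style: push raw scores toward 1 for real and 0 for fake."""
    real_scores = np.asarray(real_scores, dtype=float)
    fake_scores = np.asarray(fake_scores, dtype=float)
    return float(0.5 * np.mean((real_scores - 1.0) ** 2)
                 + 0.5 * np.mean(fake_scores ** 2))

def generator_loss(fake_scores, enhanced, clean, l1_weight=100.0, use_adversarial=True):
    """Adversarial term (scores toward 1) plus weighted L1 distance to the clean signal.

    With use_adversarial=False only the L1 term remains, i.e. the
    generator-only (denoising autoencoder) setup described in question 2.
    l1_weight is an illustrative assumption, not a value from this thread.
    """
    fake_scores = np.asarray(fake_scores, dtype=float)
    l1 = float(np.mean(np.abs(np.asarray(enhanced, dtype=float)
                              - np.asarray(clean, dtype=float))))
    adv = float(0.5 * np.mean((fake_scores - 1.0) ** 2)) if use_adversarial else 0.0
    return adv + l1_weight * l1
```

Both discriminator losses reach their minimum when real inputs score high and fake inputs score low, so the least-squares version still trains a real-versus-fake critic; what differs is the shape of the penalty. With a large `l1_weight`, the L1 term can dominate the generator's gradient, which is one possible reason a generator-only run lands close to the full model.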

Many thanks!

ANYMS-A, Aug 15 '19 17:08

I think the GAN loss contributes to the high-frequency band. Without the GAN loss, an MSE or L1 loss alone doesn't capture enough high-frequency information, because the high-frequency components have low power.
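A small numerical sketch of the low-power argument, using a synthetic two-tone signal rather than real speech. The sample rate, frequencies, and amplitudes are assumptions picked for illustration; the example only shows how MSE and L1 weight the two bands, not what the GAN loss does.

```python
import numpy as np

SAMPLE_RATE = 16000                      # assumed sample rate in Hz
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE  # one second of samples

# Hypothetical "clean" signal: a strong low-frequency tone plus a weak high-frequency tone.
low_band = 1.0 * np.sin(2 * np.pi * 200 * t)
high_band = 0.05 * np.sin(2 * np.pi * 6000 * t)
clean = low_band + high_band

# Estimate A drops the high band completely but gets the low band exactly right.
estimate_no_high = low_band
# Estimate B keeps the high band but has a 10% amplitude error in the low band.
estimate_low_error = 0.9 * low_band + high_band

def mse(a, b):
    return float(np.mean((a - b) ** 2))

def l1(a, b):
    return float(np.mean(np.abs(a - b)))

mse_no_high = mse(estimate_no_high, clean)
mse_low_error = mse(estimate_low_error, clean)
l1_no_high = l1(estimate_no_high, clean)
l1_low_error = l1(estimate_low_error, clean)

print("MSE, high band missing:", mse_no_high)
print("MSE, 10% low-band error:", mse_low_error)
print("L1,  high band missing:", l1_no_high)
print("L1,  10% low-band error:", l1_low_error)
```

In this toy setup, discarding the entire high band costs an MSE of about 0.00125, while a 10% amplitude error in the low band costs about 0.005. A model trained only on such a loss is therefore penalized more for small low-band errors than for losing the high band altogether.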

JUiscoming, Jun 18 '20 17:06