Quality of converted speeches

Open inconnu11 opened this issue 5 years ago • 1 comment

Hi, I synthesized converted speech with each of these three models separately: VAE, CDVAE and CDVAE-CLS-GAN. The results of the CDVAE-CLS-GAN model sound the worst. Is it supposed to be like this? Or did I miss something?

Jun 02 '20 14:06 inconnu11

Hi,

Thank you for your interest in this repo! According to our subjective evaluation in our journal paper, CDVAE-CLS-GAN should at least be comparable with CDVAE (with GV).

If possible, could you upload your converted samples so that I can inspect what happened? Alternatively, you can compare your results with the demo.

Jun 02 '20 17:06 unilight