cdvae-vc
Quality of converted speeches
Hi, I synthesized converted speech with each of these three models (VAE, CDVAE, and CDVAE-CLS-GAN) separately. The results from the CDVAE-CLS-GAN model sound the worst. Is that expected, or did I miss something?
Hi,
Thank you for your interest in this repo!
According to our subjective evaluation in our journal paper, CDVAE-CLS-GAN should at least be comparable with CDVAE (with GV).

If possible, could you upload your converted samples so that I can inspect what happened? Alternatively, you can compare your results with the demo.