DiffGAN-TTS
PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs
Hi @keonlee9420, this software depends on [`praat-parselmouth`](https://github.com/YannickJadoul/Parselmouth) which is GPL-licensed, which means all software that depends on it must also be GPL-licensed. Might it be possible to switch to [DeepPhonemizer](https://github.com/as-ideas/DeepPhonemizer),...
When I am trying to run inference with the VCTK dataset, I am getting this error. Do we need to prepare speaker embeddings ourselves even with pre-trained VCTK models? Thanks!...
I ran into problems with the VCTK dataset again. I followed the process but got "UnboundLocalError: local variable 'f0' referenced before assignment". I wonder if it is possible to package the VCTK dataset...
Hi Keonlee, may I know how many days it took you to train DiffGAN-TTS? Please also share the specifications of the GPU you used.
Dear Keon Lee, I am a research assistant at the City University of Hong Kong. I currently conduct research related to neurolinguistics and appreciate your work on text to speech...
Hi, when I use the VCTK dataset, preprocessing fails with "UnboundLocalError: local variable 'f0' referenced before assignment", but LJSpeech works fine. By the way, when I train the...
Hi, thank you very much for your great work! I was wondering if you have conducted any evaluations of model performance and voice quality for multi-speaker results, e.g. MOS or...
Hi @keonlee9420, I encountered some problems during the training stage. My loss values occasionally fluctuate a lot during training, jumping from around 3 to tens or even hundreds. After...
Hi, keonlee. Thanks for sharing the code! I found that when training the aux model, we get \hat{x_0} from G, then diffuse it to \hat{x_1}, and finally get a prediction list [ \hat{x_0},...
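The "diffuse \hat{x_0} to \hat{x_1}" step mentioned in the last excerpt can be illustrated with the standard DDPM forward process, q(x_t | x_0) = N(sqrt(alpha_bar_t) x_0, (1 - alpha_bar_t) I). The following is only a minimal pure-Python sketch under assumptions: the function names, the linear beta schedule and its values, and the toy `x0_hat` list are hypothetical and are not taken from this repository, which works on PyTorch tensors.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Hypothetical linear noise schedule (values are illustrative only)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def diffuse(x0, t, alpha_bars, rng):
    """Sample x_t ~ q(x_t | x_0); t is 1-indexed."""
    a_bar = alpha_bars[t - 1]
    scale = math.sqrt(a_bar)
    sigma = math.sqrt(1.0 - a_bar)
    return [scale * v + sigma * rng.gauss(0.0, 1.0) for v in x0]


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_beta_schedule(4))

# Stand-in for the generator's estimate of the clean sample (assumption).
x0_hat = [0.5, -1.0, 0.25]
# One forward-diffusion step applied to that estimate.
x1_hat = diffuse(x0_hat, 1, alpha_bars, rng)
```

With a small first-step beta, `x1_hat` stays close to `x0_hat`; larger `t` mixes in progressively more Gaussian noise.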