ProDiff
PyTorch implementation of ProDiff (ACM-MM'22) with an extremely fast diffusion speech synthesis pipeline
Is there a checkpoint for aishell3?
According to this post, compatibility with gcc/gfortran 10 requires scipy>=1.5.0: https://github.com/scipy/scipy/issues/11611#issuecomment-667819936 So I'm wondering whether there is a reason to pin scipy==1.3 as specified in requirements.txt. I dropped ==1.3...
AppData\Local\Temp\pip-install-r50u5l8a\praat-parselmouth_4b359ba8e3b34072ab1c1d1b3e3b0992\setup.py", line 44, in run raise RuntimeError("CMake must be installed to build the following extensions: " + ", ".join(e.name for e in self.extensions)) RuntimeError: CMake must be installed to build...
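The traceback above says that pip is building praat-parselmouth from source and that its setup.py cannot find CMake. A minimal sketch for checking whether a `cmake` executable is on PATH before retrying the install; the helper name `find_cmake` is hypothetical and not part of this repository:

```python
import shutil


def find_cmake(executable="cmake"):
    """Return the full path of the given executable, or None if it is not on PATH."""
    return shutil.which(executable)


if __name__ == "__main__":
    path = find_cmake()
    if path is None:
        # The source build of praat-parselmouth will fail in this state.
        print("CMake not found on PATH; install it before building praat-parselmouth.")
    else:
        print(f"CMake found at {path}")
```

If the check reports that CMake is missing, installing it and reopening the terminal so that PATH is refreshed is the usual next step.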
Hi! Thanks for the great work! I have a question: is it possible to control the speed of the speech, like Fastspeech2, VITS, and so forth? Thanks!
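For context, FastSpeech-style models control speaking rate by scaling the predicted per-phoneme durations before the length regulator expands them into frames. The sketch below illustrates that idea in plain Python; the function names and the `speed` parameter are assumptions for illustration, and whether ProDiff exposes such a control is not stated here.

```python
def scale_durations(durations, speed=1.0):
    """Scale per-phoneme frame counts; speed > 1 yields shorter (faster) speech."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    # Keep at least one frame for every phoneme that originally had frames.
    return [max(1, round(d / speed)) if d > 0 else 0 for d in durations]


def length_regulate(phonemes, durations):
    """Repeat each phoneme by its duration, as a length regulator would."""
    expanded = []
    for phoneme, duration in zip(phonemes, durations):
        expanded.extend([phoneme] * duration)
    return expanded


if __name__ == "__main__":
    phonemes = ["HH", "AH", "L", "OW"]
    durations = [4, 2, 0, 6]  # hypothetical predicted frame counts
    faster = scale_durations(durations, speed=2.0)
    print(faster, len(length_regulate(phonemes, faster)))
```

In a real model the same scaling would be applied to the duration predictor's output tensor rather than to a Python list.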
Very nice work! A small question: why is the student's time step "t - self.time_scale"? I think it should be "t". Link: https://github.com/Rongjiehuang/ProDiff/blob/1d199bc504afed8756d62bf108c460b1e89330f8/modules/ProDiff/model/ProDiff.py#L172
How much data would we need to make a successful model that is able to generate high fidelity speech?
I want to train a TTS model using your repository, but when I followed the instructions I couldn't find _pre_align.py_ or _train_mfa_align.py_. Could you help me, please?
Is this an end to end solution? Can this support other languages?