Puyuan Peng
> @jasonppy Yes, 52-55% starts producing good voice. I can also tell you it works great with multilingual data. I fine-tuned this on data in 3 languages and even...
Thanks for the effort, Chenxi! This looks like an excellent demo. Right now it seems to be missing a few things: speech recognition, word alignment (without word alignment it's hard...
Thanks! The demo looks great. I have a few comments: 1. MFA is very slow. It can take 80% of the time when running the system, and if switching to...
Thanks! Really nice work! I have tested both TTS and speech editing (substitution), and both worked well. Three more requests: 1. Is it possible to remove the audiocraft submodule? I can...
I don't have a good solution for that right now. It might require some model development. One middle ground is to concatenate the original prompt and the previously generated sentence...
Thanks! Inference: for the default example in the demo (the one in inference_tts.ipynb), the 830M model needs around 22GB with kvcache on (i.e. kvcache=1) and 12GB with kvcache off;...
Yes, we will make the model support Chinese in the future.
Thanks for your interest! Model parameters will be released by the end of this month.
Thanks! Could you post the issues you encountered?
Thanks! This is very helpful. I updated the repo with new instructions on setup and also a new environment.yml file. The transformers version is 4.38.2 in my environment and it...