Puyuan Peng comments

Results: 97 comments of Puyuan Peng

Finetuning

> @jasonppy Yes 52-55 % start producing good voice, I will also let you know it works great with multi-lingual data. I finetuned this on 3 lang data and even...

Add Replicate demo and API

Thanks for the effort, Chenxi! This looks like an excellent demo. Right now it seems to be missing a few things: speech recognition, word alignment (without word alignment it's hard...

Add Replicate demo and API

Thanks! The demo looks great. I have a few comments: 1. MFA is very slow. It can take 80% of the time when running the system, and if switching to...

Add Replicate demo and API

Thanks! Really nice work! I have tested both TTS and speech editing (substitution), and they worked well. Three more requests: 1. Is it possible to remove the audiocraft submodule? I can...

Generating long speeches

I don't have a good solution for that right now. It might require some model development. One middle ground is to concatenate the original prompt and the previously generated sentence...

Question: VRAM requirements for training, finetuning, and inference?

Thanks! Inference: for the default example in the demo (the one in inference_tts.ipynb), the 830M model needs around 22GB with kvcache on (i.e. kvcache=1) and 12GB with kvcache off;...

support Chinese?

Yes, we will make the model support Chinese in the future.

Where are the model weights ?

Thanks for your interest! Model parameters will be released by the end of this month.

Request for a requirements.txt

Thanks! Could you post the issues you encountered?

Request for a requirements.txt

Thanks! This is very helpful. I updated the repo with new instructions on setup and also a new environment.yml file. The transformers version is 4.38.2 in my environment and it...