Puyuan Peng
Thank you so much! Will test it soon
Check out https://github.com/jasonppy/VoiceCraft?tab=readme-ov-file#finetuning, though data preparation in the training section still needs to be figured out to perform finetuning
Thanks! Will post an update for it this week
Thanks! about supporting. I appreciate that you like the demo and want to incorporate VoiceCraft into your Node.js system. It shouldn't be hard to wrap what's in the Jupyter notebook...
How long is the target transcript? The model is trained on short sentences (average length 5 sec, although the longest training data goes up to 20 sec), so you might want to finetune it...
> Thank you! I was using reference audio up to 12 seconds long + target transcript which is about 4 seconds long.
>
> I’ll try using a reference which...
Sometimes the speaker similarity can be a bit off; it's as if the model uses a different voice than the prompt. One thing that I found can improve speaker similarity...
The TTS finetuned 330M model is up and should be better than the 830M one
Thanks! I haven't tried finetuning much. You could use the 330M model. I tried finetuning on as little as 550h of LibriTTS and it doesn't seem to overfit -...
Looks pretty good! I think it's worth generating a few samples. I don't really know what a good value for loss and top10 is, as different data have different levels...