
Training/running on custom speaker?

Open youssefabdelm opened this issue 2 years ago • 3 comments

Hi! Thanks for your amazing work! If I want the best possible performance on a new speaker, do I need to train this model on their audio recordings? If so, with the 4x V100 setup you mentioned, how long would I need to fine-tune the existing model on the files I have?

Also, how big a dataset do I need for a single speaker?

youssefabdelm, Jul 31 '22 15:07

The model is not intended to work on unseen speakers, so it should be trained on the target speaker. If you mean 'training from scratch' vs. 'fine-tuning' for a new speaker, I expect fine-tuning to work better with fewer than 10k samples. I just used the 4x V100 GPUs I had available and trained for about a week on LJSpeech, so I would guess that half a week is enough for fine-tuning. Sorry, I don't have much experience with questions such as the minimum dataset size.

imdanboy, Aug 05 '22 14:08

Oh sorry, I didn't know that! So for training from scratch, would it need around 10K examples? How many examples did you use, and how long is each (5 seconds, 10 seconds, etc.)?

Also, did you train on 44.1kHz audio, and is it possible to do that? Would I have to make the model larger?

youssefabdelm, Aug 06 '22 13:08

Sorry for the late reply. Because I have limited experience, I can only speak with confidence about what I've done.

The experiment I ran used the LJSpeech dataset, which has 13,100 samples with a mean duration of 6.5s, as you can check.
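As a rough check on those figures, the total amount of audio follows from the sample count and the mean duration. The snippet below is only a back-of-the-envelope sketch: the function name `total_hours` and the 10,000-clip corpus are hypothetical, introduced here for illustration, and nothing in it says how much data the model actually needs.

```python
# Figures quoted in this thread for LJSpeech.
NUM_SAMPLES = 13_100
MEAN_DURATION_S = 6.5


def total_hours(num_samples: int, mean_duration_s: float) -> float:
    """Total audio length in hours, given a clip count and mean clip length."""
    return num_samples * mean_duration_s / 3600.0


ljspeech_hours = total_hours(NUM_SAMPLES, MEAN_DURATION_S)
print(f"LJSpeech: ~{ljspeech_hours:.1f} h of audio")

# Hypothetical smaller corpus (assumption), only to show how the total scales.
small_hours = total_hours(10_000, MEAN_DURATION_S)
print(f"10,000 clips at the same mean length: ~{small_hours:.1f} h")
```

So the week of training mentioned earlier was on roughly 23.7 hours of single-speaker audio.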

I downsampled audio to 22.05kHz where it was 44.1kHz, but I expect 44.1kHz to work as well. Model performance may depend on various factors (dataset, model size, hyper-parameters, ...). Since JETS is also implemented in ESPnet, if you want a quick experiment, you can follow the recipe using the JETS configuration: https://github.com/espnet/espnet/blob/master/egs2/ljspeech/tts1/conf/tuning/train_jets.yaml

imdanboy, Aug 24 '22 06:08
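On the 44.1kHz question, one way to see what changes is to count waveform samples and feature frames per clip at each sample rate. This is a minimal sketch under stated assumptions: the hop lengths of 256 and 512 are illustrative values, not taken from this thread, and the helper names are hypothetical. It only shows how sequence lengths scale; it does not answer whether the model itself has to be larger.

```python
MEAN_DURATION_S = 6.5  # mean LJSpeech clip length quoted in the thread


def samples_per_clip(duration_s: float, sample_rate_hz: int) -> int:
    """Number of waveform samples in a clip of the given duration."""
    return round(duration_s * sample_rate_hz)


def frames_per_clip(duration_s: float, sample_rate_hz: int, hop_length: int) -> int:
    """Number of whole hops (feature frames) that fit in the clip."""
    return samples_per_clip(duration_s, sample_rate_hz) // hop_length


# (sample rate in Hz, hop length in samples); hop lengths are assumptions.
settings = [(22_050, 256), (44_100, 256), (44_100, 512)]

for sample_rate_hz, hop_length in settings:
    n_samples = samples_per_clip(MEAN_DURATION_S, sample_rate_hz)
    n_frames = frames_per_clip(MEAN_DURATION_S, sample_rate_hz, hop_length)
    print(f"{sample_rate_hz} Hz, hop {hop_length}: "
          f"{n_samples} samples, {n_frames} frames")
```

Doubling the sample rate doubles the samples the waveform generator must produce per clip. Keeping the hop length fixed also doubles the frame count, while doubling the hop length keeps the frame count the same.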