How to finetune the given pre-trained model?
I was looking to fine-tune the VITS model on some custom data, but noticed that the released model is only the generator, and the discriminator is also required to continue training from the checkpoint. Is the discriminator model available somewhere, or is there some other way to fine-tune the released model?
No discriminator model is available, so there is no way to fine-tune.
The pre-trained model in this repo might not have a discriminator, but the one at coqui-tts does, and I get good results when fine-tuning that VITS model to a new voice in relatively few steps.
Where is the coqui-tts discriminator?
You can also use the pretrained HiFi-GAN discriminator; VITS uses HiFi-GAN's discriminator code: https://github.com/jaywalnut310/vits/blob/2e561ba58618d021b5b8323d3765880f7e0ecfdb/models.py#L364
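For anyone trying this, here is a rough sketch of how the weight transfer could work. It is not code from either repo: the `do_02500000` filename and the `"mpd"` checkpoint key follow the official HiFi-GAN release, the index remapping assumes VITS's layout (slot 0 for the scale discriminator, slots 1-5 for the period discriminators), and only the period discriminators transfer, since the scale discriminators differ architecturally.

```python
import torch
from models import MultiPeriodDiscriminator  # from the VITS repo

net_d = MultiPeriodDiscriminator(use_spectral_norm=False)

# HiFi-GAN's "do_*" checkpoints store the multi-period discriminator
# weights under the "mpd" key.
hifigan = torch.load("do_02500000", map_location="cpu")

remapped = {}
for key, value in hifigan["mpd"].items():
    # HiFi-GAN MPD slots 0-4 correspond to VITS slots 1-5, because VITS
    # reserves discriminators.0 for its scale discriminator.
    prefix, idx, rest = key.split(".", 2)
    remapped[f"{prefix}.{int(idx) + 1}.{rest}"] = value

# strict=False leaves the (architecturally different) scale discriminator
# at its random initialization; inspect what did and didn't transfer.
missing, unexpected = net_d.load_state_dict(remapped, strict=False)
print("missing:", len(missing), "unexpected:", len(unexpected))
```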
@iamkhalidbashir Did you fine-tune the pretrained model on VCTK (or LJ Speech?) with an additional speaker? And about coqui-tts, can you tell me where the discriminator is? The downloaded model looks like just the generator.
Hi everyone, following the suggestion from @CookiePPP, I have a fork with a running fine-tuning loop (only for LJSpeech at the moment). I am using the LJSpeech generator from this repo; I extracted the discriminator from HiFi-GAN and amended the code slightly to make it work.
https://github.com/nivibilla/efficient-vits-finetuning
I would love to get help with fine-tuning, as I'm currently limited to just using Colab lol.
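For readers who just want the shape of the setup, here is a minimal sketch (not the fork's actual code) of how fine-tuning could be kicked off, assuming the released `pretrained_ljs.pth` generator checkpoint, a discriminator warm-started from HiFi-GAN weights as sketched above, and an arbitrarily chosen reduced learning rate:

```python
import torch
import utils  # from the VITS repo
from models import SynthesizerTrn, MultiPeriodDiscriminator
from text.symbols import symbols

hps = utils.get_hparams_from_file("configs/ljs_base.json")

net_g = SynthesizerTrn(
    len(symbols),
    hps.data.filter_length // 2 + 1,
    hps.train.segment_size // hps.data.hop_length,
    **hps.model)
net_d = MultiPeriodDiscriminator(hps.model.use_spectral_norm)

# The released checkpoint contains only the generator weights.
utils.load_checkpoint("pretrained_ljs.pth", net_g, None)

# 2e-5 is an assumed fine-tuning learning rate, well below the
# from-scratch rate in the config.
optim_g = torch.optim.AdamW(net_g.parameters(), 2e-5,
                            betas=hps.train.betas, eps=hps.train.eps)
optim_d = torch.optim.AdamW(net_d.parameters(), 2e-5,
                            betas=hps.train.betas, eps=hps.train.eps)

# From here, train.py's train_and_evaluate loop can run on the custom
# dataset with both networks starting from pretrained weights.
```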