Indic-TTS

Fine-Tuning Guide to Add a New Speaker

Open harshvardhan-truefan opened this issue 2 years ago • 10 comments

Hi, I really love the work done in this repo; it has been very helpful. Just a request: could you please add more documentation on fine-tuning the models for a new voice using the available model checkpoints? It is not very clear how to fine-tune the model on a new dataset.

Thanks in advance!

Regards, Harsh

harshvardhan-truefan avatar Sep 06 '23 12:09 harshvardhan-truefan

+1

ShyamGadde avatar Oct 07 '23 08:10 ShyamGadde

+1

ultralegendary avatar Dec 04 '23 18:12 ultralegendary

I'd like one too. I haven't explored the codebase yet, but I think it's based on Coqui TTS, so my guess is that similar training and fine-tuning methods would apply. There are some resources online detailing how to add a new speaker for Coqui. I'll explore further and keep you guys posted.

h2210316651 avatar Mar 24 '24 06:03 h2210316651

@h2210316651 did you manage to figure out the fine-tuning of Indic-TTS?

sachin7695 avatar Aug 17 '24 06:08 sachin7695

@sachin7695 I explored Coqui in depth and decided it's way too complex for me to use. I have switched to RVC v2: I just run TTS on the content I want synthesized, then use RVC to change the speaker's voice. All you need is a 15-minute sample for a pretty good quality voice clone.

h2210316651 avatar Aug 17 '24 06:08 h2210316651

@sachin7695 there's another project called Applio, please check it out.

h2210316651 avatar Aug 17 '24 06:08 h2210316651

@h2210316651 I tried fine-tuning Coqui XTTS v2 on Hindi and was able to get it working; I explored that in depth. The thing is, the FastPitch in Indic-TTS and the FastPitch in Coqui TTS turned out to be somewhat different when I checked the model state, which is why I am pretty interested in how Indic-TTS fine-tuning works. Anyway, thanks a lot for your reply.

sachin7695 avatar Aug 17 '24 07:08 sachin7695
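
For anyone wanting to repeat the "model state" check mentioned above, here is a minimal sketch of one way to compare two checkpoints by parameter name and shape. It is not the code used in the thread. The helper name `diff_state_shapes` and the parameter names in the example are made up for illustration, and the commented-out loading line assumes the weights sit under a `"model"` key in the checkpoint, which may not hold for every checkpoint.

```python
def diff_state_shapes(shapes_a, shapes_b):
    """Compare two {parameter_name: shape} mappings.

    Returns the names found in only one mapping and the names
    present in both but with different shapes.
    """
    keys_a, keys_b = set(shapes_a), set(shapes_b)
    return {
        "only_in_a": sorted(keys_a - keys_b),
        "only_in_b": sorted(keys_b - keys_a),
        "shape_mismatch": sorted(
            k for k in keys_a & keys_b
            if tuple(shapes_a[k]) != tuple(shapes_b[k])
        ),
    }


# With PyTorch installed, the mappings could be built roughly like this
# (ASSUMPTION: the state dict is stored under the "model" key):
#   import torch
#   ckpt = torch.load("checkpoint.pth", map_location="cpu")
#   shapes = {k: tuple(v.shape) for k, v in ckpt["model"].items()}

# Hypothetical parameter names and shapes, for illustration only.
indic_shapes = {
    "encoder.emb.weight": (100, 384),
    "decoder.proj.weight": (80, 384),
    "pitch.bias": (1,),
}
coqui_shapes = {
    "encoder.emb.weight": (120, 384),
    "decoder.proj.weight": (80, 384),
    "speaker_emb.weight": (4, 384),
}

report = diff_state_shapes(indic_shapes, coqui_shapes)
for section, names in report.items():
    print(section, names)
```

Names that appear on only one side, or with mismatched shapes, are the layers that would fail to load if one checkpoint were dropped into the other model's code.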

@h2210316651 bro, I fine-tuned Coqui XTTS v2 for different Indic languages such as Odia and Bangla. If you want to explore it, just ping me. You just need to train the tokenizer (a byte-pair encoding model) on Indic transcriptions.

sachin7695 avatar Aug 21 '24 17:08 sachin7695
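
To make the tokenizer step above concrete, here is a toy, pure-Python sketch of how byte-pair encoding merges are learned from a list of transcripts. It only illustrates the idea and is not the XTTS v2 tokenizer or its file format; the function names (`learn_bpe_merges`, `merge_pair`, `encode_word`) and the sample transcripts are made up for illustration. It splits words into Unicode code points, so Indic vowel signs and viramas start out as separate symbols and get merged as they recur.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one joined symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe_merges(transcripts, num_merges):
    """Learn up to `num_merges` merge rules from an iterable of transcript strings."""
    # Each whitespace-separated word starts as a tuple of code points.
    word_freqs = Counter()
    for line in transcripts:
        for word in line.split():
            word_freqs[tuple(word)] += 1

    merges = []
    for _ in range(num_merges):
        pair_freqs = Counter()
        for symbols, freq in word_freqs.items():
            for pair in zip(symbols, symbols[1:]):
                pair_freqs[pair] += freq
        if not pair_freqs:
            break  # every word is already a single symbol
        # Most frequent pair; ties are broken by the pair itself for determinism.
        best = max(pair_freqs, key=lambda p: (pair_freqs[p], p))
        merges.append(best)
        updated = Counter()
        for symbols, freq in word_freqs.items():
            updated[merge_pair(symbols, best)] += freq
        word_freqs = updated
    return merges


def encode_word(word, merges):
    """Tokenize one word by applying the learned merges in the order they were learned."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


# Made-up sample transcripts, for illustration only.
transcripts = ["नमस्ते दुनिया", "नमस्ते भारत", "भारत महान"]
merges = learn_bpe_merges(transcripts, num_merges=20)
print(encode_word("नमस्ते", merges))
print(encode_word("भारतीय", merges))
```

For a real fine-tuning run you would train on the full transcription set of the target language with a much larger merge budget, most likely using a tokenizer library rather than this toy loop.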

@sachin7695 are there any resources on fine-tuning Coqui XTTS v2 that I should follow to fine-tune a single speaker for Indic languages? Also, have you tried fine-tuning Indic-TTS?

shrey802 avatar Mar 11 '25 10:03 shrey802

@shrey802 Indic-TTS fine-tuning code is not available. I am confused about how the FastPitch model in Indic-TTS is trained and did not find any resources for that. I tried to fine-tune a FastPitch model separately and integrate it with Indic-TTS, but ran into some errors. You can ping me on LinkedIn (linked in my GitHub profile) if you want help fine-tuning XTTS v2 for Indic languages.

sachin7695 avatar Mar 11 '25 10:03 sachin7695