
How can I use my own voice to train the AI

Open CodeMagic6 opened this issue 1 year ago • 8 comments

I want to use my own voice to train the AI, but I don't know the steps. Can anyone write a tutorial about it?

I don't know how to train the AI with my own voice recordings so that it generates speech in my voice. Can anyone write out the detailed steps?

CodeMagic6 avatar Jun 07 '24 03:06 CodeMagic6

Same question. I need the English speaker to have more of a Singlish/Chinese accent.

cesinsingapore avatar Jun 07 '24 03:06 cesinsingapore

I also need to clone my own voice.

kuang-kuang avatar Jun 07 '24 04:06 kuang-kuang

This fork is trying to add fine-tuning: https://github.com/ain-soph/ChatTTS

MethanJess avatar Jun 13 '24 03:06 MethanJess

This fork is trying to add fine-tuning: https://github.com/ain-soph/ChatTTS

Does it support training a new language, rather than just fine-tuning for voice cloning?

yoesak avatar Jun 14 '24 12:06 yoesak

This fork is trying to add fine-tuning: https://github.com/ain-soph/ChatTTS

Does it support training a new language, rather than just fine-tuning for voice cloning?

The fork doesn't make it clear how to do the fine-tuning.

cesinsingapore avatar Jun 20 '24 01:06 cesinsingapore

This fork is trying to add fine-tuning: https://github.com/ain-soph/ChatTTS

Does it support training a new language, rather than just fine-tuning for voice cloning?

The fork doesn't make it clear how to do the fine-tuning.

First fine-tune the encoder, then fine-tune the GPT.

xpdd123 avatar Jun 20 '24 03:06 xpdd123
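
For readers wondering what this two-stage recipe might look like in practice, here is a minimal sketch assuming a generic PyTorch setup. `finetune_encoder`, `finetune_gpt`, the `(codes, loss)` encoder output, and the dataset loader are all hypothetical placeholders, not part of the published ChatTTS API; the actual fork may structure things differently.

```python
import torch
from torch import nn, optim

def finetune_encoder(encoder: nn.Module, loader, epochs: int = 10) -> nn.Module:
    """Stage 1: fit the (hypothetical) VQ encoder on target-speaker audio."""
    opt = optim.AdamW(encoder.parameters(), lr=1e-4)
    encoder.train()
    for _ in range(epochs):
        for audio, _text in loader:
            _codes, vq_loss = encoder(audio)   # assumed to return (codes, loss)
            opt.zero_grad()
            vq_loss.backward()
            opt.step()
    return encoder

def finetune_gpt(gpt: nn.Module, encoder: nn.Module, loader, epochs: int = 10) -> nn.Module:
    """Stage 2: teach the GPT to predict the frozen encoder's codes from text."""
    for p in encoder.parameters():             # freeze the stage-1 weights
        p.requires_grad_(False)
    opt = optim.AdamW(gpt.parameters(), lr=1e-5)
    gpt.train()
    for _ in range(epochs):
        for audio, text in loader:
            with torch.no_grad():
                codes, _ = encoder(audio)      # discrete targets for the GPT
            logits = gpt(text)                 # assumed shape (batch, T, vocab)
            loss = nn.functional.cross_entropy(logits.transpose(1, 2), codes)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return gpt
```

The key design point is the ordering: the encoder must be trained first so its code indices are stable, and it is then frozen so the GPT learns to map text to a fixed target vocabulary.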

I'm the one who created that fork.
Yes, first train the encoder and then the GPT. However, the current encoder loss cannot be optimized below 0.1, which makes training infeasible.

A possible solution is to modify the current VQ encoder architecture and explore better training hyper-parameters, or to find a better dataset for training.

But I'm busy with work, so … I hope someone else can take that on.

ain-soph avatar Jul 05 '24 15:07 ain-soph
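
To make the loss-plateau point above concrete: a VQ encoder of this kind typically optimizes the standard VQ-VAE quantization objective, where the commitment weight `beta` and the codebook size are exactly the sort of hyper-parameters one might explore. This is a generic VQ-VAE sketch (van den Oord et al., 2017), not code from ChatTTS or the fork; whether the fork's encoder loss is computed this way is an assumption.

```python
import torch
import torch.nn.functional as F

def vector_quantize(z_e: torch.Tensor, codebook: torch.Tensor, beta: float = 0.25):
    """z_e: encoder outputs (batch, T, d); codebook: (K, d) embedding table."""
    # Nearest-neighbour lookup: map each frame to its closest code vector.
    flat = z_e.reshape(-1, z_e.size(-1))
    idx = torch.cdist(flat, codebook).argmin(dim=-1)
    z_q = codebook[idx].view_as(z_e)
    # Codebook term pulls code vectors toward encoder outputs; the
    # beta-weighted commitment term keeps the encoder near the codebook.
    codebook_loss = F.mse_loss(z_q, z_e.detach())
    commit_loss = F.mse_loss(z_e, z_q.detach())
    # Straight-through estimator: gradients flow through z_q back to the encoder.
    z_q = z_e + (z_q - z_e).detach()
    return z_q, codebook_loss + beta * commit_loss
```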

I'm the one who created that fork. Yes, first train the encoder and then the GPT. However, the current encoder loss cannot be optimized below 0.1, which makes training infeasible.

A possible solution is to modify the current VQ encoder architecture and explore better training hyper-parameters, or to find a better dataset for training.

But I'm busy with work, so … I hope someone else can take that on.

@ain-soph Can you tell me how to fine-tune the encoder and then the GPT? Unfortunately, I am not able to find a fine-tuning script anywhere. It would be great if you could guide me on this.

vivek-kumar-vkb avatar Jan 07 '25 17:01 vivek-kumar-vkb