Other languages

Open cseti007 opened this issue 1 year ago • 6 comments

Thanks for your great work! I'm just wondering how large a dataset is recommended for training from scratch for other languages?

Thank you!

cseti007, Jul 24 '24 05:07

I've had success training in Spanish with ~70 hours of data. However, proper nouns aren't pronounced correctly, and the pronunciation in general isn't always ideal.

rlenain, Jul 24 '24 15:07

Of course you can. Check the Whisper tokenizer and add <|your language|> at the start of each sentence.
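A minimal sketch of the tagging step, for illustration only. The helper name `add_language_tag` is hypothetical, and the `<|es|>` form follows Whisper's language-token convention; whether the CosyVoice data pipeline expects the tag in exactly this position and spelling is an assumption to verify against the tokenizer configuration.

```python
# Hypothetical helper (not part of CosyVoice): prepend a Whisper-style
# language tag such as <|es|> to every training transcript.

def add_language_tag(text: str, lang_code: str) -> str:
    """Return the transcript prefixed with a Whisper-style language token."""
    tag = f"<|{lang_code}|>"
    stripped = text.strip()
    # Avoid double-tagging a transcript that already starts with the tag.
    if stripped.startswith(tag):
        return stripped
    return tag + stripped


# Example transcripts are invented for illustration.
transcripts = ["Hola, ¿cómo estás?", "<|es|>Buenos días."]
tagged = [add_language_tag(t, "es") for t in transcripts]
for line in tagged:
    print(line)
```

Applying the tag in one place while preparing the transcripts keeps it consistent across the whole dataset.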

aluminumbox, Jul 25 '24 09:07

@aluminumbox I'm getting a weird issue in Spanish where proper nouns and uncommon words aren't pronounced correctly; I think it might be a tokenizer issue. Do you have any idea how the BPE tokenizer would react to a new language, and why it would struggle with proper nouns and uncommon words?

rlenain, Jul 25 '24 09:07

> @aluminumbox I'm getting a weird issue in Spanish where proper nouns and uncommon words aren't pronounced correctly; I think it might be a tokenizer issue. Do you have any idea how the BPE tokenizer would react to a new language, and why it would struggle with proper nouns and uncommon words?

We use the Whisper tokenizer; check cosyvoice.yaml. We also don't have much experience with Spanish tokenization.
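One way to reason about the BPE question is to look at how many pieces a word is split into. The sketch below is a toy character-level BPE, not Whisper's actual byte-level tokenizer; the tiny corpus, the function names, and the example words are all invented for illustration. It only shows the general behaviour that words unseen during merge learning fall back to many small pieces, which may or may not be what causes the mispronunciations described above.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` BPE merge rules from a {word: count} dict."""
    vocab = {tuple(word): count for word, count in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols, count in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pair_counts[pair] += count
        if not pair_counts:
            break  # every word is already a single symbol
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        vocab = {tuple(merge_pair(s, best)): c for s, c in vocab.items()}
    return merges


def segment(word, merges):
    """Split `word` into pieces by applying the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return symbols


# Toy corpus (invented): frequent words get merged into single tokens.
toy_corpus = {"the": 50, "there": 20, "other": 15, "then": 10}
merges = learn_bpe(toy_corpus, num_merges=10)

for word in ["the", "other", "zaragoza"]:
    pieces = segment(word, merges)
    print(word, len(pieces), pieces)
```

In this toy setup "the" and "other" each come out as a single piece, while "zaragoza" stays as eight single characters. Running the same kind of count with the real tokenizer on the problematic proper nouns would show whether they are fragmented in a similar way.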

aluminumbox, Jul 25 '24 09:07

Hello @rlenain, are you training only the LLM model or also the flow model? And how many GPU resources did you use for the Spanish training?

drlor2k, Aug 29 '24 10:08

Hi @aluminumbox, do you think it's better to train CosyVoice from scratch or just fine-tune the CosyVoice-300M base model if I want to train on a new language? Also, should I train both the LLM and the flow model if I want to fine-tune it?

justinatbahasa, Oct 09 '24 12:10

> I've had success training in Spanish with ~70 hours of data. However, proper nouns aren't pronounced correctly, and the pronunciation in general isn't always ideal.

@rlenain Have you solved this issue? Which models did you train? Could you please share your code?

ukemamaster, Dec 30 '24 23:12

If anyone knows of a guide to fine-tuning or training our own models for other languages, please share a link. My idea is to evaluate how this project compares to other TTS solutions out there, then stick with one instead of using several different solutions.

I see we have a lot of languages here that in theory we could train.

juniormayhe, Dec 30 '24 23:12

> I've had success training in Spanish with ~70 hours of data. However, proper nouns aren't pronounced correctly, and the pronunciation in general isn't always ideal.

Hi @rlenain, what is your data format? And which code did you follow to add the Spanish language?

ukemamaster, Jan 03 '25 17:01

Hi @rlenain, I'm also trying to fine-tune. Could you share your steps?

SerdarBayraktar, Jul 06 '25 21:07