Can StyleTTS2 use phonemization from different languages to finetune or train?

Open tanishbajaj101 opened this issue 1 year ago • 1 comments

I am trying to build a model that can accommodate words from both my native language and English. My native language is Hindi, written in Devanagari script.

Here is an example of the kind of sentence I want it to handle: "मेरा car service 21 जुलाई को scheduled है, क्या आप मुझे इसके सभी details दे सकते हैं?"

This is the phonemized output, which mixes two languages in a single string:

(hi)meːɾaː(enus) kɑːɹ sɜːvɪs twɛnti wʌn (hi)ɟʊlaːi koː(enus) skɛdʒuːld (hi)hɛː(enus) (hi)kːjaː aːp mʊɟʰeː ɪskeː sʌbʰi(enus) diːteɪlz (hi)deː sʌkteː hɛ̃(enus)

The phonemizer detects the two scripts and marks each switch with (hi) for Hindi and (enus) for American English.
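To make the script-based switching concrete, here is a minimal sketch of how a sentence could be split into Hindi and English runs before phonemization. It is not taken from StyleTTS2 or from any phonemizer; the function names `script_of` and `split_by_script` and the tag strings are hypothetical, and the rule that digits, spaces, and punctuation stay with the preceding run is an assumption chosen to match the output above, where "21" is read as English.

```python
def script_of(ch):
    """Return a language tag for one character, or None if it is neutral."""
    if "\u0900" <= ch <= "\u097f":      # Devanagari Unicode block
        return "hi"
    if ch.isascii() and ch.isalpha():   # basic Latin letters
        return "enus"
    return None                         # digits, spaces, punctuation


def split_by_script(text, default="enus"):
    """Split text into (tag, segment) runs; neutral chars stay with the current run."""
    segments = []
    current_tag = None
    buf = []
    for ch in text:
        tag = script_of(ch)
        if tag is not None and tag != current_tag:
            if current_tag is not None:
                # Script changed: close the previous run.
                segments.append((current_tag, "".join(buf)))
                buf = []
            current_tag = tag
        buf.append(ch)
    if buf:
        # Text with no letters at all falls back to the default tag.
        segments.append((current_tag or default, "".join(buf)))
    return segments


if __name__ == "__main__":
    for tag, segment in split_by_script("मेरा car service 21 जुलाई को"):
        print(tag, repr(segment))
```

Each run could then be passed to a phonemizer configured for that language. Note that this approach only works when the two languages use different scripts.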

If I record suitable audio for such sentences, would I be able to train the model on it?
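One check that can be done before recording anything is whether every phoneme in the mixed output is covered by the symbol set the model is trained with. The sketch below assumes the model uses a fixed symbol vocabulary and that the language flags are removed before training; neither point is confirmed in this thread. The helper names and the tiny vocabulary in the usage example are hypothetical.

```python
import re

# Matches language-switch flags such as "(hi)" or "(enus)".
FLAG = re.compile(r"\([a-z\-]+\)")


def strip_flags(phonemes):
    """Remove language-switch flags from a phonemized string."""
    return FLAG.sub("", phonemes)


def unknown_symbols(phonemes, vocab):
    """Return the sorted characters that are not in the given symbol vocabulary."""
    return sorted({ch for ch in strip_flags(phonemes) if ch not in vocab})


if __name__ == "__main__":
    sample = "(hi)meːɾaː(enus) kɑːɹ sɜːvɪs"
    toy_vocab = set("meaɾkɑɹsɜvɪː ")   # hypothetical vocabulary, for illustration only
    print(strip_flags(sample))
    print(unknown_symbols(sample, toy_vocab))
```

Any symbol this reports as missing, for example a combining nasalization mark, would have to be added to the vocabulary or mapped to an existing symbol before training.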

Jul 19 '24 05:07 tanishbajaj101

I am working on a similar project, where a few Turkish words appear within English text. Is it possible for the TTS to read English and Turkish together? For example, can a sentence like "After Atatürk Caddesi, you need to go one more kilometer" (where "Atatürk Caddesi" is Turkish) be read realistically? If so, should I train only on English audio that contains Turkish words, or should I also include audio containing only Turkish?

Jan 31 '25 09:01 fkaplan