MMS - Incorrect vocab.txt for Korean TTS, so nothing gets past the OOV filter
I downloaded the Korean TTS model with: wget https://dl.fbaipublicfiles.com/mms/tts/kor.tar.gz
After running infer.py, no text appears after the line "text after filtering OOV:".
I checked vocab.txt and it looks incorrect for Korean: it shows English letters. I downloaded the archive again to be sure, but it is still wrong.
We use uroman encoding for Korean TTS, so the vocab contains romanized symbols, and raw Korean characters in the input are treated as OOV and filtered out. We will push a fix for uroman conversion soon.
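For readers following along, here is a minimal sketch of why the filtered text comes out empty. It is not the code in infer.py: `filter_oov` is a hypothetical helper, the vocab is assumed to be lowercase Latin letters plus space (based on the vocab.txt described above), and the romanized string is an illustrative example rather than actual uroman output.

```python
def filter_oov(text, vocab):
    """Hypothetical helper: keep only characters that appear in the vocab."""
    return "".join(ch for ch in text if ch in vocab)


# Assumed vocab: lowercase Latin letters plus space.
latin_vocab = set("abcdefghijklmnopqrstuvwxyz ")

raw = "안녕하세요"
# Illustrative romanization only; real uroman output may differ.
romanized = "annyeonghaseyo"

# Every Hangul character is missing from the Latin vocab, so nothing survives.
filtered_raw = filter_oov(raw, latin_vocab)

# Romanized text consists of vocab symbols, so it passes through unchanged.
filtered_romanized = filter_oov(romanized, latin_vocab)

print(repr(filtered_raw))
print(repr(filtered_romanized))
```

Under these assumptions, the input has to be romanized before filtering, which matches the uroman conversion fix mentioned above.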
@chevalierNoir Thanks. Are there any plans to release code for fine-tuning the TTS models?
@ogkalu2 the issue should be fixed now. Please try again. For training, we use the original VITS code.
The OOV filter is removing Turkish letters such as "Ö", "Ç", "İ", and "Ğ".
@serkandyck this is possibly an encoding problem with your input text. I tried this sentence and it seems to work fine.
Raw: Çalışkan İpek, öğleden sonra ödevini özenle tamamladı.
Filtered: çalışkan i̇pek öğleden sonra ödevini özenle tamamladı
This is the command:
PYTHONPATH=$PYTHONPATH:/path/to/vits python examples/mms/tts/infer.py \
--model-dir /path/to/tur/ --wav ./tur.wav \
--txt "Çalışkan İpek, öğleden sonra ödevini özenle tamamladı"
@chevalierNoir It works now. Thank you
I looked into the problem a bit and saw that the vocab file was not being read as UTF-8. I also submitted a fix as a pull request: https://github.com/facebookresearch/fairseq/pull/5148#issue-1724265247
https://github.com/facebookresearch/fairseq/blob/25c20e6a5e781e4ef05e23642f21c091ba64872e/examples/mms/tts/infer.py#L27
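A small sketch of the kind of failure described here, assuming a vocab file with one symbol per line. `load_vocab` is a hypothetical helper, not the function in infer.py, and latin-1 stands in for whatever non-UTF-8 default encoding a platform might use.

```python
import os
import tempfile


def load_vocab(path, encoding="utf-8"):
    """Hypothetical helper: read one symbol per line with an explicit encoding."""
    with open(path, encoding=encoding) as f:
        return [line.rstrip("\n") for line in f]


# Write a tiny UTF-8 vocab file containing Turkish letters.
symbols = ["ç", "ğ", "ö", "ş", "ü", "ı"]
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".txt", delete=False
) as tmp:
    tmp.write("\n".join(symbols) + "\n")
    vocab_path = tmp.name

# Reading with the encoding the file was written in recovers the symbols.
utf8_vocab = load_vocab(vocab_path)

# Reading the same bytes as latin-1 turns each letter into two unrelated
# characters, so a lookup for "ğ" or "ö" would fail and the letter would
# be treated as OOV.
latin1_vocab = load_vocab(vocab_path, encoding="latin-1")

os.remove(vocab_path)

print(utf8_vocab)
print("ğ" in latin1_vocab)
```

Passing `encoding="utf-8"` explicitly to `open` avoids depending on the platform's default encoding.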