
Parentheses (and some other characters) lead to KeyError on inference

Open GhostDog98 opened this issue 8 months ago • 1 comments

Let's say I attempt inference on the following sentence: "I am happy not sad". This works perfectly fine.

However, if I attempt inference on "I am happy (not sad)", I get:

Traceback (most recent call last):
  File "/mnt/bc457ffc-58c4-4dfe-a922-2b44ae3fa37e/vits/inference.py", line 43, in <module>
    stn_tst = get_text(txt, hps)
  File "/mnt/bc457ffc-58c4-4dfe-a922-2b44ae3fa37e/vits/inference.py", line 20, in get_text
    text_norm = text_to_sequence(text, hps.data.text_cleaners)
  File "/mnt/bc457ffc-58c4-4dfe-a922-2b44ae3fa37e/vits/text/__init__.py", line 23, in text_to_sequence
    symbol_id = _symbol_to_id[symbol]
KeyError: '('

In fact, any character not in the following list will raise this exception:

['_', ';', ':', ',', '.', '!', '?', '¡', '¿', '—', '…', '"', '«', '»', '“', '”', ' ', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'ɑ', 'ɐ', 'ɒ', 'æ', 'ɓ', 'ʙ', 'β', 'ɔ', 'ɕ', 'ç', 'ɗ', 'ɖ', 'ð', 'ʤ', 'ə', 'ɘ', 'ɚ', 'ɛ', 'ɜ', 'ɝ', 'ɞ', 'ɟ', 'ʄ', 'ɡ', 'ɠ', 'ɢ', 'ʛ', 'ɦ', 'ɧ', 'ħ', 'ɥ', 'ʜ', 'ɨ', 'ɪ', 'ʝ', 'ɭ', 'ɬ', 'ɫ', 'ɮ', 'ʟ', 'ɱ', 'ɯ', 'ɰ', 'ŋ', 'ɳ', 'ɲ', 'ɴ', 'ø', 'ɵ', 'ɸ', 'θ', 'œ', 'ɶ', 'ʘ', 'ɹ', 'ɺ', 'ɾ', 'ɻ', 'ʀ', 'ʁ', 'ɽ', 'ʂ', 'ʃ', 'ʈ', 'ʧ', 'ʉ', 'ʊ', 'ʋ', 'ⱱ', 'ʌ', 'ɣ', 'ɤ', 'ʍ', 'χ', 'ʎ', 'ʏ', 'ʑ', 'ʐ', 'ʒ', 'ʔ', 'ʡ', 'ʕ', 'ʢ', 'ǀ', 'ǁ', 'ǂ', 'ǃ', 'ˈ', 'ˌ', 'ː', 'ˑ', 'ʼ', 'ʴ', 'ʰ', 'ʱ', 'ʲ', 'ʷ', 'ˠ', 'ˤ', '˞', '↓', '↑', '→', '↗', '↘', "'", '̩', "'", 'ᵻ']

GhostDog98 avatar Apr 15 '25 05:04 GhostDog98

This is because the symbols in that list are the only ones the model was trained with; any character that is not in the list will raise an error.

One solution is to add more symbols to the list. I think this is the right approach if those symbols are well represented in the dataset; otherwise it can produce artifacts.

Another solution, and the one I have used, is to preprocess the text before passing it to the model for inference, so that it contains no characters outside the model's symbol list.
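A minimal sketch of what such a preprocessing step could look like. It is an illustration under assumptions, not the exact code I use: `sanitize_text` and the replacement table are hypothetical names, and turning brackets into commas is just one possible choice. It only rewrites known problem characters rather than dropping everything outside the symbol list, since the configured text cleaners may still transform the text before the symbol lookup.

```python
import re

# Hypothetical replacement table: brackets become commas so the pause survives.
_REPLACEMENTS = {
    '(': ', ', ')': ', ',
    '[': ', ', ']': ', ',
    '{': ', ', '}': ', ',
}


def sanitize_text(text, replacements=_REPLACEMENTS):
    """Replace unsupported bracket characters, then tidy the punctuation."""
    for old, new in replacements.items():
        text = text.replace(old, new)
    text = re.sub(r'\s+', ' ', text)               # collapse runs of whitespace
    text = re.sub(r'\s+,', ',', text)              # no space before a comma
    text = re.sub(r',(\s*,)+', ',', text)          # merge repeated commas
    text = re.sub(r',\s*([.!?;:])', r'\1', text)   # drop a comma before closing punctuation
    return text.strip(' ,')                        # trim leftover leading/trailing commas


print(sanitize_text("I am happy (not sad)"))
```

The sanitized string would then be passed to the existing call in place of the raw text, for example `get_text(sanitize_text(txt), hps)`.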

Pipe1213 avatar Apr 16 '25 10:04 Pipe1213