Parentheses (and some other characters) lead to a KeyError at inference
Let's say I attempt inference on the following sentence:
I am happy not sad
This works perfectly fine.
However, if I attempt inference on:
I am happy (not sad)
I get:
Traceback (most recent call last):
File "/mnt/bc457ffc-58c4-4dfe-a922-2b44ae3fa37e/vits/inference.py", line 43, in <module>
stn_tst = get_text(txt, hps)
File "/mnt/bc457ffc-58c4-4dfe-a922-2b44ae3fa37e/vits/inference.py", line 20, in get_text
text_norm = text_to_sequence(text, hps.data.text_cleaners)
File "/mnt/bc457ffc-58c4-4dfe-a922-2b44ae3fa37e/vits/text/__init__.py", line 23, in text_to_sequence
symbol_id = _symbol_to_id[symbol]
KeyError: '('
In fact, any character not in the following list will raise this exception:
['_', ';', ':', ',', '.', '!', '?', '¡', '¿', '—', '…', '"', '«', '»', '“', '”', ' ', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'ɑ', 'ɐ', 'ɒ', 'æ', 'ɓ', 'ʙ', 'β', 'ɔ', 'ɕ', 'ç', 'ɗ', 'ɖ', 'ð', 'ʤ', 'ə', 'ɘ', 'ɚ', 'ɛ', 'ɜ', 'ɝ', 'ɞ', 'ɟ', 'ʄ', 'ɡ', 'ɠ', 'ɢ', 'ʛ', 'ɦ', 'ɧ', 'ħ', 'ɥ', 'ʜ', 'ɨ', 'ɪ', 'ʝ', 'ɭ', 'ɬ', 'ɫ', 'ɮ', 'ʟ', 'ɱ', 'ɯ', 'ɰ', 'ŋ', 'ɳ', 'ɲ', 'ɴ', 'ø', 'ɵ', 'ɸ', 'θ', 'œ', 'ɶ', 'ʘ', 'ɹ', 'ɺ', 'ɾ', 'ɻ', 'ʀ', 'ʁ', 'ɽ', 'ʂ', 'ʃ', 'ʈ', 'ʧ', 'ʉ', 'ʊ', 'ʋ', 'ⱱ', 'ʌ', 'ɣ', 'ɤ', 'ʍ', 'χ', 'ʎ', 'ʏ', 'ʑ', 'ʐ', 'ʒ', 'ʔ', 'ʡ', 'ʕ', 'ʢ', 'ǀ', 'ǁ', 'ǂ', 'ǃ', 'ˈ', 'ˌ', 'ː', 'ˑ', 'ʼ', 'ʴ', 'ʰ', 'ʱ', 'ʲ', 'ʷ', 'ˠ', 'ˤ', '˞', '↓', '↑', '→', '↗', '↘', "'", '̩', "'", 'ᵻ']
This happens because this symbol list contains the only characters the model was trained on; any character outside it raises a KeyError during the symbol-to-id lookup.
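For reference, the lookup that fails is roughly the following (a simplified sketch of what the text_to_sequence call in the traceback does, not the exact upstream code):

```python
from text.symbols import symbols  # the trained symbol list shown above

_symbol_to_id = {s: i for i, s in enumerate(symbols)}

def text_to_sequence(text, cleaner_names):
    # The cleaner step is omitted here for brevity; every character of the
    # (cleaned) text is mapped to its id, and a character that is missing
    # from `symbols` raises the KeyError shown above.
    return [_symbol_to_id[symbol] for symbol in text]
```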
One solution is to add more symbols to the list to avoid these errors. I think this is the right fix when those symbols are well represented in the dataset; otherwise it can introduce artifacts. A sketch of what that change could look like follows.
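If you go this route, the change is just appending the new characters to the list that feeds `_symbol_to_id`, along these lines (a sketch only; `(` and `)` are an illustration, and the model has to be trained, or at least fine-tuned, with the extended list, since the size of the text embedding follows `len(symbols)`):

```python
# text/symbols.py – sketch of an extension appended at the end of the file

_extra = '()'  # characters you want the model to handle natively

# Appending at the end keeps the ids of the existing symbols stable,
# which matters if you plan to fine-tune an existing checkpoint.
symbols = symbols + list(_extra)
```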
Another solution, and the one I have used, is to preprocess the text before passing it to the model for inference, making sure no character outside the model's symbol set gets through.
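Here is a minimal sketch of that preprocessing, assuming the symbol list is importable from text.symbols as in the repo (the sanitize name and the space replacement are my own choices):

```python
from text.symbols import symbols

_known = set(symbols)

def sanitize(text):
    # Replace every character the model was not trained on with a space
    # (space is in the symbol list), so text_to_sequence never hits a
    # missing key. Dropping the characters outright also works.
    return ''.join(ch if ch in _known else ' ' for ch in text)

# e.g. in inference.py:
# stn_tst = get_text(sanitize(txt), hps)
```

Note that this filters the raw text against the phoneme symbol set, so it also strips characters such as digits that the cleaner/phonemizer might otherwise turn into speech; depending on your cleaner you may prefer to apply the same filter to the cleaned text instead.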