TransformerTTS

No numbers in phonemes set and collapse of whitespaces

Open anh opened this issue 3 years ago • 2 comments

When using phonemizer (espeak-ng), the output contains digits that reflect the vowel/sound variants, as in the following:

import phonemizer

text = 'Có lối ra, chúng ta qua đó xem sao.'
phonemizer.phonemize(
    text,
    language='vi',
    backend='espeak',
    strip=False,
    preserve_punctuation=True,
    punctuation_marks=';:,.!?¡¿—…"«»“”',
    with_stress=True,
    language_switch='keep-flags',
    njobs=1
)

output:

'ɡˈɔɜ lˈoɪɜ zˈaː7 , tɕˈuɜŋ t̪ˈaː1 wˈaː1 ɗˈɔɜ sˈɛ1m ʂˈaːʊ7 .'

After tokenizer._postprocess, which runs:

text = ''.join([c for c in text if c in all_phonemes])  # drops digits, which are not in the phoneme set
text = _collapse_whitespace(text)

output:

ɡˈɔɜ lˈoɪɜ zˈaː,tɕˈuɜŋ tˈaː wˈaː ɗˈɔɜ sˈɛm ʂˈaːʊ.

Outputs placed together:

ɡˈɔɜ lˈoɪɜ zˈaː7 , tɕˈuɜŋ t̪ˈaː1 wˈaː1 ɗˈɔɜ sˈɛ1m ʂˈaːʊ7 .
ɡˈɔɜ lˈoɪɜ zˈaː,tɕˈuɜŋ tˈaː wˈaː ɗˈɔɜ sˈɛm ʂˈaːʊ.

My question is: will the missing digits (here 7 and 1) and the missing spaces around punctuation such as the comma (zˈaː,tɕˈuɜŋ tˈaː instead of zˈaː7 , tɕˈuɜŋ t̪ˈaː1) affect the alignment and the pauses between generated words?

anh, May 17 '21 05:05

Hi, the whitespace collapse is intentional, mostly so that the forward model can control where the pauses are placed. If you want, you can disable it at line 91 of data/text/tokenizer.py (return the result of the line above instead), but I would discourage that unless you're running into problems. As for the numbers, you can add the missing symbols (for instance 1, 2, 3, 4, 5, 6, 7, 8, 9, 0) to all_phonemes in data/text/symbols.py, like so: all_phonemes = sorted(list(_phonemes) + list(_punctuations) + list('1234567890')). I was not aware that some languages had numbers as phonemes.

TODO: Add optional extra phonemes string to data_config.yaml
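A rough sketch of how an optional extra-phonemes setting might feed into the symbol list. Everything here is an assumption for illustration: build_symbols and its extra_phonemes parameter are hypothetical names, and the two symbol strings are tiny stand-ins for the private _phonemes and _punctuations in data/text/symbols.py.

```python
# Hypothetical stand-ins for the names in data/text/symbols.py.
_phonemes = "ɡɔɜloɪzaːtɕuŋwɗsɛmʂʊˈ"
_punctuations = "!,-.:;? "


def build_symbols(extra_phonemes=""):
    """Return the sorted symbol list, optionally extended with extra
    characters (e.g. a string read from a config file)."""
    symbols = set(_phonemes) | set(_punctuations) | set(extra_phonemes)
    return sorted(symbols)


default_symbols = build_symbols()
with_digits = build_symbols(extra_phonemes="1234567890")

raw = "zˈaː7 , tɕˈuɜŋ t̪ˈaː1"
kept_default = "".join(c for c in raw if c in default_symbols)
kept_digits = "".join(c for c in raw if c in with_digits)
print(kept_default)
print(kept_digits)
```

With the digits added, the 7 and 1 survive the filter. Since the symbol list presumably defines the token vocabulary, a change like this would most likely have to be made before training rather than on an already trained model.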

cfrancesco, May 17 '21 08:05

Thank you for the clarification. Making the phonemes configurable would be super helpful. I'll try your suggestion.

anh, May 17 '21 08:05