How to improve text input notation (Question)

Open • dokuzbir opened this issue 6 years ago • 1 comment

For example, for "blue" the Tacotron output speech is fast. What I want is to slow down some parts of a word, for example "bluuue" instead of "blue". But when I input "bluuue", the pronunciation of the output changes strangely and no longer sounds like "blue". How can I achieve that?

dokuzbir · Feb 09 '19 08:02

I think you have a few options:

  1. Collect some training data with words spoken at different speeds, annotate the words with the speed, and train a model on that.

  2. Train a model using a phonetic alphabet and then insert additional phonemes into the words that you want to stretch out. See the CMUDict file in this repo for an example of translating to phonemes: https://github.com/keithito/tacotron/blob/master/text/cmudict.py

  3. Post-process the audio to slow down the words.
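As a rough illustration of option 2, here is a minimal sketch that repeats the vowel phonemes of a word and formats the result in the curly-brace ARPAbet notation this repo's text frontend accepts. The helper names `stretch_vowels` and `to_curly` are hypothetical and not part of the repo, and the phoneme list for "blue" is written out by hand rather than looked up. Whether the repeated phonemes actually produce a longer vowel depends on a model trained with phoneme input, so treat this as something to experiment with rather than a guaranteed result.

```python
# Hypothetical helpers for illustration; they are not part of the tacotron repo.

# ARPAbet vowel symbols, without the trailing stress digit (0, 1, or 2).
ARPABET_VOWELS = {
    "AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER",
    "EY", "IH", "IY", "OW", "OY", "UH", "UW",
}


def stretch_vowels(phonemes, repeat=3):
    """Return a new phoneme list in which every vowel appears `repeat` times."""
    stretched = []
    for phoneme in phonemes:
        base = phoneme.rstrip("012")  # drop the stress marker before the lookup
        count = repeat if base in ARPABET_VOWELS else 1
        stretched.extend([phoneme] * count)
    return stretched


def to_curly(phonemes):
    """Format phonemes as a curly-brace ARPAbet string, e.g. {B L UW1}."""
    return "{" + " ".join(phonemes) + "}"


# "blue" is B L UW1 in CMUDict; written by hand here instead of looked up.
blue = ["B", "L", "UW1"]
print(to_curly(stretch_vowels(blue)))  # {B L UW1 UW1 UW1}
```

The resulting string can be embedded in the input text in place of the word, so only the chosen words are stretched.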

keithito · Feb 25 '19 05:02
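For the post-processing option mentioned above, a minimal sketch in pure Python might look like the following. It slows down one region of a waveform with plain overlap-add (OLA), which is a simple stand-in for better methods such as WSOLA or a phase vocoder and can leave audible artifacts. The function names, the frame and hop sizes, and the `start`/`end` sample indices of the word are all assumptions for illustration; in practice the word boundaries would have to come from manual labeling or from an alignment.

```python
import math


def ola_stretch(samples, factor, frame=1024, hop_out=256):
    """Lengthen `samples` by roughly `factor` (>1 is slower) using plain overlap-add."""
    samples = list(samples)
    if len(samples) < frame:
        return samples  # too short to process; return unchanged
    hop_in = hop_out / factor  # read frames more densely than they are written
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame) for n in range(frame)]
    n_frames = int((len(samples) - frame) / hop_in) + 1
    out_len = (n_frames - 1) * hop_out + frame
    out = [0.0] * out_len
    norm = [0.0] * out_len  # summed window weights, used to normalize the overlap
    for i in range(n_frames):
        src = int(round(i * hop_in))
        dst = i * hop_out
        for n in range(frame):
            out[dst + n] += samples[src + n] * window[n]
            norm[dst + n] += window[n]
    return [o / w if w > 1e-8 else 0.0 for o, w in zip(out, norm)]


def stretch_region(samples, start, end, factor):
    """Slow down only samples[start:end]; the rest of the audio is left untouched."""
    samples = list(samples)
    return samples[:start] + ola_stretch(samples[start:end], factor) + samples[end:]


# Usage: a 0.5 s test tone at 16 kHz, with the middle 0.25 s slowed down.
sample_rate = 16000
tone = [math.sin(2 * math.pi * 220 * t / sample_rate) for t in range(8000)]
slowed = stretch_region(tone, 2000, 6000, factor=2.0)
print(len(tone), len(slowed))
```

Because the region is spliced back without crossfading, there may be small clicks at the boundaries; smoothing the joins would be a natural next step.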