gst-tacotron Tone transfer

I would like to know whether this model only learns the rhythm of the reference sentence you provide, rather than the tone. Can I use this model to imitate the tone of a speaker from a single sentence of their speech?

Jul 13 '18 05:07 switchzts

The style is learned in an unsupervised way, which means there is no constraint that makes the model focus only on prosody. If you read Google's other paper, you will find that it may also learn some speaker information.
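
To illustrate why nothing restricts the tokens to prosody, here is a minimal sketch of attention over a style-token bank. It is not taken from the repository: the names `style_embedding`, `tokens`, and `reference` are hypothetical, and it uses single-head dot-product attention in plain Python, whereas the actual model uses learned projections and multi-head attention. The weights depend only on how the reference embedding matches each token, and in training they are driven by the reconstruction loss alone, so any property of the reference audio that helps reconstruction (prosody, but also speaker identity) can end up in the tokens.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def style_embedding(reference, tokens):
    """Attend from a reference embedding over a bank of style tokens.

    Simplified assumption: single-head scaled dot-product attention
    with no learned projections.
    """
    dim = len(reference)
    scores = [
        sum(r * t for r, t in zip(reference, tok)) / math.sqrt(dim)
        for tok in tokens
    ]
    weights = softmax(scores)
    # The style embedding is the attention-weighted sum of the tokens.
    embedding = [
        sum(w * tok[i] for w, tok in zip(weights, tokens))
        for i in range(dim)
    ]
    return weights, embedding

# Toy values chosen only for illustration.
tokens = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
reference = [2.0, 0.0]
weights, embedding = style_embedding(reference, tokens)
```

The token most aligned with the reference receives the largest weight; what that token actually encodes is decided by training, not by any label.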

Jul 13 '18 06:07 syang1993

@syang1993 Thanks for the reply. Does that mean the training data needs sentences spoken by the same person in different rhythms? What data is in Blizzard Challenge 2013? I am still downloading it. Is it a training set of one speaker with varying rhythms?

Jul 13 '18 06:07 switchzts

The Blizzard 2013 dataset is audiobook data from a single speaker and contains rich prosody. Besides, if you use neural data to train this model, the model will not learn the prosody information. It may work as traditional tacotron.

Jul 14 '18 05:07 syang1993

@syang1993 Hi, thanks for your nice work. As you mentioned above: "Besides, if you use neural data to train this model, the model will not learn the prosody information. It may work as traditional tacotron". What do you mean by "neural data"? Using your published code, my model hardly learns any prosody information. What should I do next?

Sep 18 '18 03:09 GengwangGitHub
