tensorflow-wavenet

Is it possible to condition on arbitrary utterances?

Open markov01 opened this issue 8 years ago • 2 comments

Hi all,

I'm confused about how to generate meaningful sentences with different voices.

On the DeepMind blog, we can hear samples of meaningful speech in different voices. Is this done by conditioning WaveNet both locally and globally, or is it conditioned only globally, on a meaningful sentence with different voices?

markov01, Apr 26 '17 03:04

They condition globally to select which speaker is used, and locally to feed linguistic features from the text, coming from a TTS frontend. See, for example, #235.

lemonzi, May 22 '17 22:05
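
A minimal NumPy sketch of the two conditioning paths described above, assuming the gated activation unit from the WaveNet paper. All names, shapes, and random weights here are hypothetical and are not taken from the tensorflow-wavenet code; random arrays stand in for the dilated convolution outputs.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def conditioned_gate(x_filter, x_gate, h_global, h_local, w):
    """Gated activation with global and local conditioning.

    x_filter, x_gate: (T, C) outputs of the filter and gate convolutions
    h_global:         (G,)   one vector per utterance, e.g. a speaker embedding
    h_local:          (T, L) one vector per audio sample, e.g. linguistic features
    w:                dict of hypothetical projection matrices
    """
    # Global term: a single (C,) vector, broadcast over every time step.
    g_f = h_global @ w["global_filter"]
    g_g = h_global @ w["global_gate"]
    # Local term: a (T, C) array that changes from sample to sample.
    l_f = h_local @ w["local_filter"]
    l_g = h_local @ w["local_gate"]
    return np.tanh(x_filter + g_f + l_f) * sigmoid(x_gate + g_g + l_g)

rng = np.random.default_rng(0)
T, C, G, L = 16, 8, 4, 5  # time steps, channels, global dim, local dim
w = {
    "global_filter": rng.normal(size=(G, C)),
    "global_gate": rng.normal(size=(G, C)),
    "local_filter": rng.normal(size=(L, C)),
    "local_gate": rng.normal(size=(L, C)),
}
x_filter = rng.normal(size=(T, C))
x_gate = rng.normal(size=(T, C))
text_features = rng.normal(size=(T, L))  # same "sentence" for both speakers
speaker_a = rng.normal(size=G)
speaker_b = rng.normal(size=G)

out_a = conditioned_gate(x_filter, x_gate, speaker_a, text_features, w)
out_b = conditioned_gate(x_filter, x_gate, speaker_b, text_features, w)
```

Here the same local features with two different global vectors give two different activations, which is the "same sentence, different voice" setup asked about in the issue.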

They condition globally to select which speaker is used, and locally to feed linguistic features from the text, coming from a TTS frontend.

@lemonzi How do you decide whether to use local or global conditioning? If, instead of linguistic features, we feed features taken from a spectrogram, how is local conditioning determined?

SatyamKumarr, Dec 20 '18 09:12
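
One way to read the distinction, following the WaveNet paper: a feature that is a single vector for the whole utterance (such as a speaker identity) is global, while a feature that is a time series (linguistic features or spectrogram frames) is local and has to be brought to the audio sample rate, either by learned upsampling or by repeating values across time. The sketch below shows only the repetition option. The function name, the hop length of 256, and the array sizes are assumptions for illustration and do not come from the tensorflow-wavenet code.

```python
import numpy as np

def upsample_by_repetition(frames, hop_length, num_samples):
    """Repeat each frame-rate feature vector so there is one per audio sample.

    frames:      (num_frames, feature_dim), e.g. spectrogram frames
    hop_length:  audio samples between consecutive frames (assumed value)
    num_samples: length of the audio the features must align with
    """
    upsampled = np.repeat(frames, hop_length, axis=0)
    if upsampled.shape[0] < num_samples:
        # Pad by repeating the last frame if the audio is slightly longer.
        pad = num_samples - upsampled.shape[0]
        upsampled = np.pad(upsampled, ((0, pad), (0, 0)), mode="edge")
    return upsampled[:num_samples]

# Hypothetical spectrogram: 4 frames with 3 bins each.
frames = np.arange(12, dtype=np.float64).reshape(4, 3)
local_condition = upsample_by_repetition(frames, hop_length=256, num_samples=1000)

# A global condition, by contrast, needs no time axis at all.
global_condition = np.array([0.0, 1.0, 0.0])  # e.g. a one-hot speaker ID
```

The upsampled array has one row per audio sample and can be fed to the network alongside the waveform, while the global vector is simply broadcast over all time steps.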