Keith Ito
One of the datasets they used in the Deep Voice 2 paper was VCTK, which can be downloaded [here](http://homepages.inf.ed.ac.uk/jyamagis/page3/page58/page58.html). It's distributed under the ODC Attribution License.
It's hard to say without more information, but 13k iterations is probably not enough.
- What are you using for training data?
- What does your loss curve look like? (One way to plot it from the training log is sketched below.)
...
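For the loss-curve question, a few lines of Python are enough to extract and plot the loss. This is a minimal sketch that assumes your training log contains lines like `Step 13000 [0.85 sec/step, loss=0.245, ...]`; the exact log format and file path are assumptions, so adjust the regex and path to match your setup.

```python
# Sketch: plot the loss curve from a training log.
# Assumes log lines like "Step 13000 [0.85 sec/step, loss=0.245, ...]".
import re
import matplotlib.pyplot as plt

steps, losses = [], []
with open('logs-tacotron/train.log') as f:  # path is an assumption
    for line in f:
        m = re.search(r'Step (\d+) .*?loss=([\d.]+)', line)
        if m:
            steps.append(int(m.group(1)))
            losses.append(float(m.group(2)))

plt.plot(steps, losses)
plt.xlabel('step')
plt.ylabel('loss')
plt.show()
```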
@MXGray: would you be willing to share your pre-trained model on the Nancy corpus?
@navidnadery I'm not sure why this would happen. Maybe your training data lacks long sentences? You can also try [Location Sensitive Attention](https://github.com/keithito/tacotron/blob/tacotron2-work-in-progress/models/attention.py) (or hybrid attention)...
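For reference, here's a minimal NumPy sketch of the idea behind location-sensitive attention (Chorowski et al., 2015): the attention energies additionally condition on convolutional features of the previous alignment, which encourages the alignment to move forward monotonically. All names, shapes, and hyperparameters below are illustrative, not the repo's actual implementation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def location_sensitive_alignment(query, memory, prev_alignment,
                                 W, V, U, w, conv_filters):
    """Attention weights that also condition on the previous alignment.

    query:          decoder state, shape (d_q,)
    memory:         encoder outputs, shape (T, d_m)
    prev_alignment: previous attention weights, shape (T,)
    conv_filters:   1-D conv filters over the alignment, shape (k, n_filters)
    """
    k, n_filters = conv_filters.shape
    padded = np.pad(prev_alignment, k // 2)
    # Location features f, shape (T, n_filters): convolve the previous alignment.
    f = np.stack([padded[t:t + k] @ conv_filters for t in range(len(memory))])
    # e_j = w^T tanh(W q + V h_j + U f_j)
    energies = np.tanh(query @ W + memory @ V + f @ U) @ w
    return softmax(energies)

# Toy usage with random parameters:
T, d_m, d_q, d_a, k, nf = 20, 8, 6, 10, 5, 4
rng = np.random.default_rng(0)
align = location_sensitive_alignment(
    rng.normal(size=d_q), rng.normal(size=(T, d_m)), np.full(T, 1.0 / T),
    rng.normal(size=(d_q, d_a)), rng.normal(size=(d_m, d_a)),
    rng.normal(size=(nf, d_a)), rng.normal(size=d_a),
    rng.normal(size=(k, nf)))
```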
There's an example in the training code: https://github.com/keithito/tacotron/blob/a4f5ac3dfc596425206235d931e907b639a60ed4/train.py#L113
I think you have a few options:
1. Collect some training data with words spoken at different speeds, annotate the words with the speed, and train a model on that. (One way to feed the annotations to the model is sketched below.)
...
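For option 1, one plausible way to feed the annotations in is to tag each character with its word's speed label and concatenate a learned speed embedding onto the character embedding before the encoder. This is a sketch under assumptions, not something from this repo; all names and dimensions are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, char_dim = 40, 256
n_speeds, speed_dim = 3, 16  # e.g. slow / normal / fast

char_embedding = rng.normal(size=(vocab_size, char_dim))
speed_embedding = rng.normal(size=(n_speeds, speed_dim))

def encode(char_ids, speed_ids):
    """Return encoder inputs of shape (T, char_dim + speed_dim).

    Each character gets the speed label of the word it belongs to.
    """
    return np.concatenate([char_embedding[char_ids],
                           speed_embedding[speed_ids]], axis=-1)

# "hello" spoken fast: every character carries speed id 2.
inputs = encode(np.array([7, 4, 11, 11, 14]), np.full(5, 2))
```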
@toannhu Yes, that repo looks great! I'm training right now on LJ Speech. There's some more discussion over at https://github.com/keithito/tacotron/issues/90
Can you attach some examples, and the command line you're using to continue training?
@begeekmyfriend Yes, if you send over a PR, I would be happy to review and merge it.