Francesco comments

Results: 87 comments by Francesco

duration not predicted correctly

Hi, when producing the mels for WaveRNN (assuming you do want to use the predicted rather than the ground truth ones), you could do a validation step, using the ground...

Issues replicating the examples

Hi, did you use a pretrained model? Which version of the repo are you using (which commit)? It might be samples from an older model file than the most recent...

Issues replicating the examples

Also, if you're interested in replicating the results using our pretrained models, you can just try the Colab Notebooks.

Issues replicating the examples

Sounds fine to me. This is inverted with the Griffin-Lim algorithm, so the sound quality is expected to be low. You need to follow the next steps in the notebook and convert it...

Shapes in forward model does not match

Hi, is it possible that you trained the autoregressive model only up to a reduction factor greater than 1 (with your settings, for fewer than 250K steps)?

Unsuitable location for the new_adam static method

Hi, concerning the hard-coded parameters: we have not yet experimented with other values, as they are quite constant throughout the literature. So unless there is evidence that they are significantly...

Any Suggestions to introduce pauses (Up or down) in the produced speech?

Hi, yes, you will want to train a forward model for this. There you can directly control the duration of each phoneme.
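To illustrate what controlling per-phoneme durations can look like, here is a minimal sketch in plain Python. It is not the repository's API: the function names `scale_durations` and `expand_by_duration`, the `pause_factor` parameter, and the use of `"_"` as a pause symbol are all assumptions made for this example. The idea is that a forward model predicts a frame count per phoneme, so lengthening or shortening a pause amounts to scaling the frame count at that position before the phonemes are expanded to frame level.

```python
def scale_durations(durations, pause_positions, pause_factor=2.0):
    """Return a copy of per-phoneme frame counts with selected positions scaled.

    Hypothetical helper: `durations` is a list of predicted frame counts,
    `pause_positions` is a set of indices to lengthen (factor > 1) or
    shorten (factor < 1). Scaled positions keep at least one frame.
    """
    scaled = []
    for index, frames in enumerate(durations):
        if index in pause_positions:
            frames = max(1, round(frames * pause_factor))
        scaled.append(int(frames))
    return scaled


def expand_by_duration(phonemes, durations):
    """Repeat each phoneme for its number of frames (length regulation)."""
    if len(phonemes) != len(durations):
        raise ValueError("phonemes and durations must have the same length")
    expanded = []
    for phoneme, frames in zip(phonemes, durations):
        expanded.extend([phoneme] * frames)
    return expanded


# Example values are made up for illustration only.
phonemes = ["h", "_", "i"]   # "_" stands in for a pause symbol (assumption)
predicted = [3, 2, 4]        # frames per phoneme, as a model might predict
longer_pause = scale_durations(predicted, {1}, pause_factor=2.0)
frame_level = expand_by_duration(phonemes, longer_pause)
```

In a real forward model the expansion is applied to encoder outputs rather than to symbols, but the bookkeeping is the same: the total number of output frames is the sum of the (possibly scaled) durations.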

Regarding mel start and end token

Hi @bkumardevan07, if you start with r=1 you will most likely not get an alignment between text and audio. You can observe this in TensorBoard in the last layer: if...

Audio Alignment

What exactly do you mean by aligning the audio? With the script extract_durations.py you generate a dataset for the forward model using the predictions of the autoregressive model. If...

Audio Alignment

Hi, to evaluate your autoregressive model FOR the alignment extraction, you have to look at the last-layer attention heads on your TRAINING SET. If these do not show significant...