George comments

Results: 11 comments of George

Some questions

That's great 😀 I am trying to adjust meldataset.py right now, using your WaveGrad implementation as a pointer. I hope I will be able to do it.

Some questions

I thought that too. And you can extract the spectrograms using ExtractTTSpectrogram, so I already have them. I will try that too.

Some questions

Yes, on LibriTTS (or, more preferably, another multispeaker set that is of more universal sound quality). But it'd have to be after my single speaker tests so I can see...

Some questions

I tried to implement Mozilla AP by adding it as a module to the structure, along with the class and changing the getitem function: ``` def __getitem__(self, index): filename =...

Some questions

> > Hi, thanks for sharing the code, it is well appreciated. Some questions: > > > > * Do you train with mean-var normalization? If not, what is the...

Some questions

> Yes. I meant 2,500,000 steps. > However, it synthesizes high-quality audio even at an earlier step. Therefore, it is advisable to adjust the training steps as needed. > I...

Some questions

They do, yes 😀 LJSpeech TTS sounds much more natural after finetuning, so I am holding out hope. Is there any intuition for training to 2.5M steps or was it...

Some questions

> @george-roussos > > We've observed that setting fmax to unlimited value improves the quality in the experiment using the LJ Speech dataset. > This part is expected to be...

Some questions

I cannot share samples; the speaker has not given me consent. It sounds good during eval (when training and finetuning with ground truth extracted using TTS). I do not notice...

Some questions

Sounds good 😀 My speaker has approximately 28 hours of audio and they are very breathy, so it should not be a problem of too few occurrences. I use an...