
FastSpeech2 trained using LibriTTS dataset

Open LEEYOONHYUNG opened this issue 4 years ago • 3 comments

Hi, my name is Yoonhyung Lee, and I am studying text-to-speech. Thank you for your nice implementation of FastSpeech2. It helped me a lot in studying the model, but a question occurred to me.

According to the README.md, it seems that you have trained FastSpeech2 on the LibriTTS dataset, but I cannot find the audio samples. Did you use all 585 hours of the dataset for training? How well does FastSpeech2 work on a multi-speaker dataset?

LEEYOONHYUNG avatar May 17 '21 15:05 LEEYOONHYUNG

@LEEYOONHYUNG Oh, I just forgot to post the audio samples. I'll update the demo page some other day. Honestly speaking, the quality of the synthesized LibriTTS samples is not as good as the results on the single-speaker dataset. I guess it is because the environmental noise in the LibriTTS dataset is much more severe than in the LJSpeech dataset. It might be a good idea to apply some data cleaning tricks before training the TTS model.
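As a rough illustration of such cleaning, one common trick is to trim leading/trailing low-energy silence and peak-normalize each clip before feature extraction. This is a minimal NumPy sketch, not code from this repo; the frame size and dB threshold are arbitrary assumptions you would tune per dataset:

```python
import numpy as np

def clean_clip(audio, frame=512, db_threshold=-40.0):
    """Peak-normalize, then trim leading/trailing low-energy frames."""
    # Peak-normalize to [-1, 1] so the energy threshold is comparable across clips
    peak = np.max(np.abs(audio))
    if peak > 0:
        audio = audio / peak
    # Frame-wise RMS energy in dB
    n_frames = len(audio) // frame
    frames = audio[: n_frames * frame].reshape(n_frames, frame)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    db = 20.0 * np.log10(np.maximum(rms, 1e-10))
    # Keep the span between the first and last frame above the threshold
    keep = np.where(db > db_threshold)[0]
    if len(keep) == 0:
        return audio[:0]
    return audio[keep[0] * frame : (keep[-1] + 1) * frame]

# Example: 1 s of silence, 1 s of a 220 Hz tone, 1 s of silence at 16 kHz
sr = 16000
t = np.linspace(0, 1, sr, endpoint=False)
clip = np.concatenate([np.zeros(sr), 0.5 * np.sin(2 * np.pi * 220 * t), np.zeros(sr)])
cleaned = clean_clip(clip)  # roughly the middle 1 s survives
```

In practice one would also want to remove stationary background noise (e.g. with a spectral-gating denoiser) and discard clips whose overall SNR is too low, since simple trimming does nothing about noise overlapping the speech itself.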

ming024 avatar May 26 '21 07:05 ming024

I think it is quite natural that learning multi-speaker TTS is more difficult. Thank you for your reply :D

LEEYOONHYUNG avatar May 27 '21 01:05 LEEYOONHYUNG

@ming024 any chance you could post a pretrained model for the multi-speaker English dataset, LibriTTS?

Great work with this repo, and thanks in advance!

lkurlandski avatar Jun 16 '21 17:06 lkurlandski