FastSpeech

Published 20 hours ago

Deepest-Project


Implementation of "FastSpeech: Fast, Robust and Controllable Text to Speech"


Training

1. Set data_path in hparams.py to the LJSpeech folder.
2. Set teacher_dir in hparams.py to the directory where the alignments and mel-spectrogram targets are saved.
3. Provide the checkpoint of the pre-trained transformer-tts (the weights of its embedding and encoder layers are reused).
4. Run python train.py.
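
The two settings named above could look like the following in hparams.py. This is only a sketch: data_path and teacher_dir are the names from this README, but the path values are placeholders and the missing_dirs helper is hypothetical (it is not part of the repository). It simply checks that both directories exist before training is started.

```python
# Sketch of the two hparams.py entries mentioned in the README.
# The path values are placeholders, not the repository's defaults.
from pathlib import Path

data_path = "/path/to/LJSpeech-1.1"        # LJSpeech folder (placeholder)
teacher_dir = "/path/to/teacher_outputs"   # alignments and mel targets (placeholder)


def missing_dirs(*dirs):
    """Hypothetical helper: return the given directories that do not exist."""
    return [d for d in dirs if not Path(d).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(data_path, teacher_dir)
    if missing:
        print("Fix these paths before running train.py:", missing)
    else:
        print("Both directories exist.")
```

Running this file directly prints which of the two paths still need to be fixed.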

Training curves (orange: character / blue: phoneme)

The character and phoneme training sets differ in size because the transformer-tts trained on phonemes shows more diagonal attention, so more of its alignments are usable.
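
This README does not state how "diagonal" is measured. As an illustration only, one simple way to score an attention matrix is the fraction of its mass that lies within a band around the diagonal. The function name diagonal_mass and the band parameter are assumptions made for this sketch, not the repository's criterion.

```python
def diagonal_mass(attn, band=0.2):
    """Fraction of attention mass lying near the diagonal.

    attn: list of rows (decoder steps), each a list of weights over encoder steps.
    band: half-width of the accepted band on the normalized (0..1) axes.
    This is an illustrative score, not the criterion used by the repository.
    """
    n_dec, n_enc = len(attn), len(attn[0])
    near, total = 0.0, 0.0
    for t, row in enumerate(attn):
        for s, w in enumerate(row):
            total += w
            # Compare the normalized positions of the decoder and encoder steps.
            if abs((t + 0.5) / n_dec - (s + 0.5) / n_enc) <= band:
                near += w
    return near / total if total else 0.0


# A perfectly diagonal alignment versus a completely flat one.
identity = [[1.0 if s == t else 0.0 for s in range(4)] for t in range(4)]
uniform = [[0.25] * 4 for _ in range(4)]
print(diagonal_mass(identity))  # 1.0
print(diagonal_mass(uniform))   # 0.25
```

Under a score like this, utterances whose alignments fall below some threshold would be dropped, which is one way the two datasets could end up with different sizes.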

train:val:test = 8:1:1; total: character 1126 / phoneme 3412
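
For reference, an 8:1:1 split of the character total can be sketched as follows. The function name split_8_1_1, the seeded shuffle, and the rounding rule (remainder goes to the test set) are assumptions for this example; the README gives only the ratio and the totals.

```python
import random


def split_8_1_1(items, seed=0):
    """Shuffle items and split them into train/val/test at a ratio of 8:1:1.

    Integer arithmetic is used for the sizes; any remainder goes to the test set.
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 8 // 10
    n_val = len(items) // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test


train, val, test = split_8_1_1(range(1126))
print(len(train), len(val), len(test))  # 900 112 114
```

With the phoneme total of 3412, the same rule gives 2729, 341, and 342 items.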

Training plots (orange: batch_size:64 / blue: batch_size:32)

Audio Samples

You can hear the audio samples here

