FastSpeech2

Tensorboard ? :D

Open dathudeptrai opened this issue 5 years ago • 7 comments

Do you have any TensorBoard logs? :D The audio samples sound good except for some background noise; maybe training longer will solve this problem, haha :D. Great job :D.

dathudeptrai avatar Jul 05 '20 16:07 dathudeptrai

[Image: Tensorboard]

rishikksh20 avatar Jul 05 '20 18:07 rishikksh20

[Image: Tensorboard]

rishikksh20 avatar Jul 05 '20 18:07 rishikksh20

@dathudeptrai Currently I am using raw pitch and energy with MSE, which is why the error looks so high; if required, I will standardize or normalize them in the future. Pitch and energy both seem to be static after 10k steps and are heading towards over-fitting.

rishikksh20 avatar Jul 05 '20 18:07 rishikksh20
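
For readers following along, here is a minimal sketch of why MSE on raw pitch looks so large and how standardization changes its scale. It is not code from the repository: the helper names `fit_stats` and `standardize` and the pitch values are made up for illustration, and NumPy is used instead of the project's framework to keep it short.

```python
import numpy as np

def fit_stats(values):
    """Mean and standard deviation, e.g. computed once over the training set."""
    values = np.asarray(values, dtype=np.float64)
    return values.mean(), values.std()

def standardize(values, mean, std, eps=1e-8):
    """Z-score normalization: zero mean, unit variance (eps avoids division by zero)."""
    return (np.asarray(values, dtype=np.float64) - mean) / (std + eps)

# Hypothetical raw pitch values in Hz (illustrative only).
raw_pitch = np.array([110.0, 220.0, 180.0, 150.0, 240.0])
mean, std = fit_stats(raw_pitch)
norm_pitch = standardize(raw_pitch, mean, std)

# A hypothetical predictor that is off by 10 Hz on every frame.
pred_raw = raw_pitch + 10.0

# MSE on the raw scale is dominated by the units (Hz squared).
mse_raw = np.mean((pred_raw - raw_pitch) ** 2)

# The same error measured on the standardized scale is much smaller.
mse_norm = np.mean((standardize(pred_raw, mean, std) - norm_pitch) ** 2)

print(f"raw MSE: {mse_raw:.2f}, standardized MSE: {mse_norm:.4f}")
```

The prediction error is identical in both cases; only the scale of the reported loss changes, so a high raw-scale MSE is not by itself a sign of a bad model.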

@rishikksh20 is the L1 loss masked? I use the same preprocessing as ESPnet, but my validation L1 loss is below about 0.3.

dathudeptrai avatar Jul 06 '20 07:07 dathudeptrai

@dathudeptrai I am using a masked L1 loss. The l1_loss shown is actually the combined loss of the before- and after-Postnet L1 losses, which is why it's very high. That said, the individual before and after L1 losses are also very high, around 0.47, when they should be below about 0.3. I don't know why this is happening, since the quality of the generated audio and spectrograms seems fine. Also, I am not using ESPnet pre-processing; I am using Nvidia's pre-processing for simplicity and easy integration with WaveGlow and my own implementation of MelGAN. You can look at the code, especially fastspeech.py; if you find any irregularities in the implementation, please let me know.

rishikksh20 avatar Jul 06 '20 07:07 rishikksh20
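
To make the masking question concrete, here is a minimal NumPy sketch of a masked L1 loss over padded mel-spectrograms. It is an illustration only, not the repository's implementation: the function name `masked_l1`, the `(batch, time, n_mels)` layout, and the toy values are assumptions, and "combined" is assumed here to mean the sum of the before- and after-Postnet terms.

```python
import numpy as np

def masked_l1(pred, target, lengths):
    """Mean absolute error over valid (non-padded) frames only.

    pred, target: arrays of shape (batch, time, n_mels)
    lengths:      number of valid frames per utterance, shape (batch,)
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    lengths = np.asarray(lengths)
    _, time, n_mels = pred.shape
    # mask[b, t] is True where frame t of utterance b is real data.
    mask = np.arange(time)[None, :] < lengths[:, None]
    abs_err = np.abs(pred - target) * mask[:, :, None]
    # Normalize by the number of valid elements, not the padded size.
    return abs_err.sum() / (mask.sum() * n_mels)

# Toy batch: two utterances padded to 4 frames, 3 mel bins each.
target = np.zeros((2, 4, 3))
pred = np.full((2, 4, 3), 0.5)
pred[1, 2:] = 10.0          # arbitrary values in the padded frames
lengths = [4, 2]            # second utterance has only 2 valid frames

loss_masked = masked_l1(pred, target, lengths)
loss_unmasked = np.abs(pred - target).mean()

# Assumed combination: before-Postnet loss plus after-Postnet loss.
# The same prediction is reused for both terms to keep the example small.
loss_combined = masked_l1(pred, target, lengths) + masked_l1(pred, target, lengths)

print(loss_masked, loss_unmasked, loss_combined)
```

Without the mask, padded frames leak into the average and inflate the number; with it, a combined value is simply about twice a single term when both terms are similar.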

@rishikksh20 does Nvidia's normalization map values to the range 0 -> 4?

dathudeptrai avatar Jul 06 '20 07:07 dathudeptrai

@dathudeptrai I don't think so; see https://github.com/rishikksh20/FastSpeech2/blob/5bc2b402a237ed57e236c3a75d19964cf0f71987/utils/stft.py#L161. They are using spectral normalization: https://github.com/rishikksh20/FastSpeech2/blob/5bc2b402a237ed57e236c3a75d19964cf0f71987/utils/stft.py#L153

rishikksh20 avatar Jul 06 '20 08:07 rishikksh20
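
As a rough illustration of the range question, the sketch below assumes that "spectral normalization" here means the log dynamic range compression used in Nvidia's Tacotron 2 reference code (clamp to a small floor, multiply by a constant, take the natural log). Whether the linked lines match this exactly is an assumption; the function is a NumPy port written for this example, and the input values are made up.

```python
import numpy as np

def dynamic_range_compression(x, C=1.0, clip_val=1e-5):
    """Log compression: log(max(x, clip_val) * C).

    C and clip_val are assumed defaults, chosen for illustration.
    """
    return np.log(np.clip(x, a_min=clip_val, a_max=None) * C)

# Hypothetical linear mel magnitudes, from silence to a loud frame.
mel = np.array([0.0, 1e-5, 1.0, 50.0])
log_mel = dynamic_range_compression(mel)

# The floor is log(clip_val), roughly -11.5, so silence maps to a
# large negative value rather than 0.
print(log_mel.min(), log_mel.max())
```

Under these assumptions the output is not confined to 0 -> 4: it is bounded below by log(clip_val) and has no fixed upper bound, which could explain why L1 values differ from those seen with a preprocessing pipeline that rescales features.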