tacotron: After 13,000 training iterations, why is the synthesized speech just silence?

After 13,000 training iterations, why is the synthesized speech just silence?

Open Text2-m opened this issue 6 years ago • 4 comments

Apr 19 '19 01:04 Text2-m

It's hard to say without more information, but 13k iterations is probably not enough.

What are you using for training data?
What does your loss curve look like?
Can you post the latest alignment image? It should be dumped to the training directory.

Apr 26 '19 01:04 keithito
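
When reporting a problem like this, it can help to say whether the output is truly silent or just very quiet or noisy. The following is a minimal sketch for measuring that, not part of the tacotron repository: the function names (`read_pcm16`, `rms_dbfs`, `is_silent`) and the -60 dBFS threshold are assumptions chosen for illustration, and it assumes the synthesized file is 16-bit PCM WAV.

```python
import array
import math
import sys
import wave


def read_pcm16(path):
    """Read a 16-bit PCM WAV file and return its samples as a list of ints.

    Hypothetical helper: assumes the synthesized audio was saved as
    16-bit PCM. Multi-channel audio is returned interleaved.
    """
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(frames)
    if sys.byteorder == "big":
        # WAV data is little-endian; swap on big-endian machines.
        samples.byteswap()
    return samples.tolist()


def rms_dbfs(samples):
    """Return the RMS level of int16 samples in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square / (32768.0 ** 2))


def is_silent(samples, threshold_db=-60.0):
    """Treat audio as silent if its RMS level is below threshold_db.

    The -60 dBFS default is an arbitrary assumption, not a standard.
    """
    return rms_dbfs(samples) < threshold_db
```

For example, `is_silent(read_pcm16("eval.wav"))` (with `eval.wav` standing in for whatever file your synthesis step wrote) returning `True` suggests the model is emitting near-zero output, while a high RMS level with unintelligible audio points to noise instead.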

How many iterations are needed?

May 21 '19 15:05 vinnitu

May 24 '19 13:05 vinnitu

Same here. I have trained for 15k steps on the Mozilla dataset. The attention plot looks good, but synthesis produces noise.

[Image: step-15000-align]

Sep 02 '19 04:09 japita-se