tacotron
Why is the synthesized speech just silence after 13,000 training iterations?
It's hard to say without more information, but 13k iterations is probably not enough.
- What are you using for training data?
- What does your loss curve look like?
- Can you post the latest alignment image? This should be dumped to the training directory.
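It may also help to confirm whether the output is truly silent or just very quiet. Here is a minimal sketch, assuming the synthesized audio is a 16-bit PCM WAV file. The function names and the threshold of 100 are illustrative assumptions, not part of tacotron:

```python
import math
import struct
import wave


def rms(samples):
    """Root-mean-square amplitude of a sequence of numeric samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def read_pcm16(path):
    """Read all samples from a 16-bit PCM WAV file as a tuple of ints."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        frames = wav.readframes(wav.getnframes())
    # Little-endian signed 16-bit samples; channels stay interleaved,
    # which does not matter for an overall loudness estimate.
    return struct.unpack("<%dh" % (len(frames) // 2), frames)


def is_silent(samples, threshold=100.0):
    """True if the RMS amplitude is below the threshold (an assumed value)."""
    return rms(samples) < threshold
```

For example, `is_silent(read_pcm16("eval.wav"))`, where `eval.wav` is a placeholder for whichever audio file your run produced. An RMS near zero means the model is emitting actual silence rather than quiet or noisy speech.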
How many iterations are needed?
Is this normal? e78123b6-5aca-4233-9472-ded968904295.zip step-12000-audio.zip
Same here. I have trained for 15k steps on the Mozilla dataset. The attention plot looks good, but synthesis produces noise.
