deep-voice-conversion
Did anyone finish sequence-to-sequence attention training?
I wrote this referencing https://github.com/keithito/tacotron, but it does not work. Feeding the ground-truth mel-spectrogram as the decoder input works, but feeding the predicted mel fails. Can anyone give me advice?
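For clarity, here is a minimal PyTorch sketch of the mismatch I mean. `decoder_cell` and `prenet` are hypothetical stand-ins for a Tacotron-style attention decoder, not the actual code from this repo; only the input-feeding logic matters here:

```python
import torch

def decode(decoder_cell, prenet, memory, ground_truth_mel=None, max_steps=400):
    """Run a Tacotron-style decoder; teacher-force when ground_truth_mel is given."""
    batch = memory.size(0)
    frame = memory.new_zeros(batch, 80)  # <GO> frame (80 mel bins assumed)
    state, outputs = None, []
    steps = ground_truth_mel.size(1) if ground_truth_mel is not None else max_steps
    for t in range(steps):
        frame, state = decoder_cell(prenet(frame), memory, state)
        outputs.append(frame)
        if ground_truth_mel is not None:
            # Training/validation: the next pre-net input is the ground-truth
            # frame, so errors never accumulate and the audio sounds fine.
            frame = ground_truth_mel[:, t]
        # Testing: `frame` stays as the *predicted* mel, so any error is fed
        # back in and compounds -- exactly the failure described above.
    return torch.stack(outputs, dim=1)
```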
I also have this issue. The audio from the validation process sounds great, but at test time the predicted mel spectrogram, rather than the ground truth, is fed into the next time step's pre-net, which leads to quite abnormal generated audio. I also found that the alignment images were not diagonal, which suggests the attention mechanism hasn't been learned well. However, I don't know how to adjust the model or the training strategy.
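One remedy for non-diagonal alignments is the guided attention loss from Tachibana et al. (2017, DC-TTS), which penalizes attention mass far from the diagonal. A minimal sketch, assuming the attention weights come out with shape `(batch, decoder_steps, encoder_steps)`:

```python
import torch

def guided_attention_loss(align, g=0.2):
    """Guided attention loss (Tachibana et al., 2017).
    align: attention weights of shape (batch, decoder_steps, encoder_steps)."""
    B, T, N = align.shape
    t = torch.arange(T, device=align.device).float().unsqueeze(1) / T
    n = torch.arange(N, device=align.device).float().unsqueeze(0) / N
    # Weight is ~0 near the diagonal and approaches 1 far from it.
    w = 1.0 - torch.exp(-((n - t) ** 2) / (2 * g * g))
    return (align * w.unsqueeze(0)).mean()
```

Adding this term to the training loss (with a small weight) is a common way to push the attention toward a diagonal alignment early in training; whether it fixes this particular model is untested.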
Yes, any clues on the Seq2Seq+Attention in this network would be great! Please update if anyone finds a solution. Thanks!