Christian Schäfer
Hi, did you check if your tacotron was trained correctly before extracting durations, i.e. a diagonal attention alignment? If the problem occurs in training I am assuming that the extracted...
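A quick way to put a number on "diagonal" is to compare the attention peaks against an ideal straight line. This is only a rough sketch: the function name `diagonal_score` and the assumed matrix layout (decoder steps as rows, encoder positions as columns) are illustrative and not taken from the repo.

```python
import numpy as np

def diagonal_score(att: np.ndarray) -> float:
    """Return a score in [0, 1]; values near 1 mean the attention peaks follow the diagonal.

    Assumes att has shape (decoder_steps, encoder_steps).
    """
    n_dec, n_enc = att.shape
    # Most attended encoder position for every decoder step.
    peaks = att.argmax(axis=1)
    # Where the peak would sit if the alignment were a perfect straight diagonal.
    ideal = np.linspace(0, n_enc - 1, n_dec)
    # Mean distance from the ideal line, normalised by the largest possible distance.
    mean_dev = np.abs(peaks - ideal).mean() / max(n_enc - 1, 1)
    return 1.0 - float(mean_dev)
```

A score well below 1 on the teacher-forced alignments would suggest the extracted durations are not trustworthy, though looking at the plot itself is still the better check.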
Hi, trimming the silences will certainly help. Is the above plot teacher-forced (GTA)? Only the GTA alignment is important. It may also help to play around...
Also, is the phonemizer doing a good job with Hindi?
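For the silence trimming mentioned above, a minimal amplitude-threshold sketch could look like this. The function name `trim_silence` and the default `threshold` of 0.01 are assumptions for illustration (it also assumes a float waveform scaled to [-1, 1]), not what the preprocessing in the repo does.

```python
import numpy as np

def trim_silence(wav: np.ndarray, threshold: float = 0.01) -> np.ndarray:
    """Drop leading and trailing samples whose absolute amplitude stays below threshold."""
    # Indices of all samples that count as "loud enough".
    loud = np.flatnonzero(np.abs(wav) >= threshold)
    if loud.size == 0:
        # The whole clip is below the threshold, so nothing is left.
        return wav[:0]
    # Keep everything between the first and the last loud sample.
    return wav[loud[0]:loud[-1] + 1]
```

A decibel-based trim such as `librosa.effects.trim` is usually more robust on noisy recordings; the version above is just the simplest thing to try.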
Hi, that actually looks pretty ok to me. Could you check the numpy files in the /data/alg folder? I suspect there might be a fishy broken file with zero durations...
EDIT: I am seeing in the plot that at the first step the attention seems all over the place, that could mess with the duration extraction (there is a simple...
Hi, yes these are messed up durations and definitely cause trouble. You could either remove the ones with first entries zero or try to redo the duration extraction with leading...
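To find the files worth removing, something along these lines might do. It is a sketch that assumes each duration file is a plain `.npy` array with one entry per input symbol; the helper name `find_suspect_durations` is made up for illustration.

```python
from pathlib import Path

import numpy as np

def find_suspect_durations(folder):
    """Return the names of .npy files that are empty or whose first duration is zero."""
    suspects = []
    for path in sorted(Path(folder).glob("*.npy")):
        durations = np.load(path)
        # An empty array or a zero first entry points to a failed extraction.
        if durations.size == 0 or durations.flat[0] == 0:
            suspects.append(path.name)
    return suspects
```

Point it at the folder holding the extracted durations and then drop the listed items from the training set, or re-run the extraction for them after trimming the leading silence.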
Yeah, these are messed up. You could either try to remove those or redo the duration extraction with trimmed silences. On Thu, 27 Aug 2020, 14:54 Prajwal Rao, wrote: >...
Yeah, increasing the batch size could help. I haven't looked into multi-GPU training yet as I don't have the hardware to benefit from it; imo it should be pretty straightforward...
Looks better. Does training work now with the forward tacotron? In the sample, though, it looks as if there is also some trailing silence left, which probably produces some large...
Hi, 500K is plenty. You could take some of the saved models and generate audio files (maybe together with a vocoder), then manually select the best model - it's hard...