MeloTTS
share loss images
Hi! Could you share the loss images from training, to give an idea of what they should look like? I'm trying to train a new single-speaker model, but my model can't articulate words at early stages (epoch 500) even though the attention matrix looks diagonal. I've attached the config file in case it helps. Thanks!!
How many hours did you train on?
In my case, I used about 5 to 8 hours of data for training, but I haven't tried with fewer hours.
Here are my loss images after training for 300 epochs.
Hey, thanks for sharing!
Never mind, it was a problem with my data. I had resampled my 22.05 kHz data to 44.1 kHz, and there were some artifacts in the high frequencies; that was the problem. Changing the sampling rate back to 22.05 kHz solved it, and now it sounds great!
In case it helps anybody, I trained with 20 hours of data. Below are my generator losses, which are higher than yours, but it still sounds great to me.
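For anyone who wants to check their own data for the same resampling issue, here is a rough sketch of one way to do it: measure how much spectral energy sits above the original Nyquist frequency after upsampling. This is not part of MeloTTS; the function name `high_band_energy_ratio` and the 11025 Hz cutoff (half of an assumed 22.05 kHz source rate) are only illustrative, and the demo uses synthetic sine waves rather than real recordings.

```python
import numpy as np


def high_band_energy_ratio(signal, sample_rate, cutoff_hz):
    """Return the fraction of spectral energy at or above cutoff_hz.

    Hypothetical helper for illustration only, not a MeloTTS function.
    """
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    total = spectrum.sum()
    if total == 0:
        return 0.0
    return float(spectrum[freqs >= cutoff_hz].sum() / total)


# Synthetic demo: one second of audio at 44.1 kHz.
sr = 44100
t = np.arange(sr) / sr

# A signal whose content stays below the assumed original Nyquist (11025 Hz).
low_only = np.sin(2 * np.pi * 440 * t)

# The same signal plus a spurious 15 kHz component standing in for an artifact.
with_artifact = low_only + 0.5 * np.sin(2 * np.pi * 15000 * t)

ratio_clean = high_band_energy_ratio(low_only, sr, cutoff_hz=11025)
ratio_artifact = high_band_energy_ratio(with_artifact, sr, cutoff_hz=11025)

print(f"clean: {ratio_clean:.6f}, with artifact: {ratio_artifact:.6f}")
```

On real data you would load each clip with your audio library of choice and pass the samples in. Audio that was cleanly upsampled from a lower rate should show very little energy above the old Nyquist, so a noticeably larger ratio may be worth a closer look at the spectrogram.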