MeloTTS

share loss images

Open AngelGuevara7 opened this issue 1 year ago • 2 comments

Hi! Could you share the loss images from training so I can get an idea of what they should look like? I'm trying to train a new single-speaker model, but it can't articulate words at early stages (epoch 500), even though the attention matrix looks diagonal. I've attached the config file in case it helps. Thanks!!

config.json

AngelGuevara7 · Mar 13 '24 15:03

How many hours of data did you train on? In my case, I used about 5 to 8 hours of data; I haven't tried fewer hours. Here are my loss images after training for 300 epochs. image

jeremy110 · Mar 15 '24 01:03

Hey, thanks for sharing! Never mind, it was a problem with my data. I had resampled my 22.05 kHz data to 44.1 kHz, and there were some artifacts in the high frequencies; that was the problem. Changing the sampling rate back to 22.05 kHz solved it, and now it sounds great! In case it helps anybody, I trained with 20 hours of data. Below are my generator losses, which are higher than yours, but it still sounds great to me. g_loss
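In case it's useful, here is a minimal sketch of how one could check the native sample rate of the original (not yet resampled) recordings against the rate in the training config before deciding whether to resample. The function names are hypothetical, and the `data.sampling_rate` key is an assumption based on VITS-style configs, so check it against your own config.json. It uses only the Python standard library and reads PCM WAV headers only.

```python
import json
import wave


def wav_sample_rate(path):
    """Return the sample rate (Hz) stored in a PCM WAV file's header."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getframerate()


def find_rate_mismatches(wav_paths, expected_rate):
    """Return (path, rate) pairs for files whose header rate differs from expected_rate."""
    mismatches = []
    for path in wav_paths:
        rate = wav_sample_rate(path)
        if rate != expected_rate:
            mismatches.append((str(path), rate))
    return mismatches


def expected_rate_from_config(config_path):
    """Read the training sample rate from a JSON config file.

    Assumption: the value lives under data.sampling_rate (VITS-style layout).
    """
    with open(config_path, encoding="utf-8") as config_file:
        return json.load(config_file)["data"]["sampling_rate"]
```

If `find_rate_mismatches` reports files below the configured rate, what worked in my case was setting the config to the recordings' native rate rather than upsampling the audio. Note that this only reads the header, so it cannot tell whether a file was already upsampled from a lower rate.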

AngelGuevara7 · Mar 18 '24 09:03