
Training loss becomes nan when the number of speakers changes

Open Alexey322 opened this issue 3 years ago • 3 comments

Hi. I trained Flowtron on two speakers, 50 hours in total, 25 hours each. After that, I wanted to fine-tune the model on 10 speakers with 20-30 minutes of data each, starting from the checkpoint trained on those 50 hours. I changed the number of speakers from 2 to 10 in the config and loaded all weights except the speaker embedding. At first the training loss jumped from around 10 to 10,000, and then it became NaN entirely. The problem is probably that the model has fit too closely to the old speaker embedding, and once it is swapped out, the model cannot adapt to the randomly initialized speaker embedding. @rafaelvalle, please tell me if you have encountered such a problem?
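Roughly, the loading step I mean looks like this (a minimal sketch; the checkpoint layout and the `speaker_embedding` key name here are assumptions for illustration, not necessarily Flowtron's exact names):

```python
import torch


def load_except_speaker_embedding(model, checkpoint_path):
    """Load pretrained weights, leaving the speaker embedding randomly initialized."""
    checkpoint = torch.load(checkpoint_path, map_location="cpu")
    # Assumption: the checkpoint is either a bare state_dict or wraps one under "state_dict".
    state_dict = checkpoint.get("state_dict", checkpoint)
    # Assumption: speaker embedding parameter names contain "speaker_embedding".
    filtered = {k: v for k, v in state_dict.items() if "speaker_embedding" not in k}
    missing, unexpected = model.load_state_dict(filtered, strict=False)
    print("Left at random init:", missing)
    return model
```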

Alexey322 avatar Sep 14 '21 13:09 Alexey322

Me too.

I trained with only one speaker and it converged.

Then I tried to adapt it to 65 speakers, with about 15 minutes of data per speaker.

I followed this:

[screenshot]

Finally I got NaN:

[screenshot]

The question is whether I am doing this correctly or not. I also tried to fine-tune following the instructions, but it reports a speaker embedding size mismatch.
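One way around the size mismatch (a sketch, not the repository's documented fine-tuning path; the checkpoint layout is an assumption) is to copy only the tensors whose names and shapes still match the current model:

```python
import torch


def load_matching_weights(model, checkpoint_path):
    """Copy only tensors whose names and shapes match the current model."""
    checkpoint = torch.load(checkpoint_path, map_location="cpu")
    state_dict = checkpoint.get("state_dict", checkpoint)  # assumed layout
    model_state = model.state_dict()
    matched = {k: v for k, v in state_dict.items()
               if k in model_state and v.shape == model_state[k].shape}
    skipped = sorted(set(state_dict) - set(matched))
    model.load_state_dict(matched, strict=False)
    print("Skipped (renamed or shape mismatch):", skipped)
    return model
```

Anything skipped this way, such as a resized speaker embedding, stays randomly initialized and is learned during fine-tuning.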

Thank you @rafaelvalle

v-nhandt21 avatar Sep 17 '21 14:09 v-nhandt21

Have you fixed it yet? @v-nhandt21

letrongan avatar Oct 27 '21 14:10 letrongan

Have you fixed it yet? @v-nhandt21

No, but I found that instead of changing the number of speakers from the pretrained model, we can just replace the embedding of one of the existing speakers.
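In other words, keep n_speakers the same as in the pretrained checkpoint and train the new voice under an existing speaker ID. A sketch of re-initializing one embedding row in place (the `speaker_embedding` attribute name is an assumption):

```python
import torch


def reassign_speaker_slot(model, speaker_id):
    """Re-initialize one row of the pretrained speaker embedding so new data
    can be trained under that existing speaker ID."""
    with torch.no_grad():
        weight = model.speaker_embedding.weight  # assumed attribute name
        weight[speaker_id].normal_(0.0, weight.std().item())
    return model


# e.g. reuse speaker ID 1 for the new voice, keeping n_speakers unchanged:
# model = reassign_speaker_slot(model, speaker_id=1)
```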

v-nhandt21 avatar Oct 28 '21 00:10 v-nhandt21