flowtron
Training loss becomes nan when the number of speakers changes
Hi. I trained Flowtron on two speakers, 50 hours in total (25 hours each). After that, I wanted to fine-tune the model on 10 speakers with 20-30 minutes of data each, using the checkpoint trained on those 50 hours as the starting point. I changed the number of speakers from 2 to 10 in the config and loaded all weights except the speaker embedding. At first the loss jumped from 10 to 10,000, and then it became NaN entirely. The problem is probably that the model has adjusted too closely to the old speaker embeddings, and when they change it cannot adapt to the randomly initialized ones. @rafaelvalle, please tell me if you have encountered this problem?
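For reference, the "load all weights except the speaker embedding" step could be sketched like this. This is a minimal illustration, not Flowtron's actual loading code: the checkpoint layout (a bare state dict or one stored under a "state_dict" key) and the parameter-key substring "speaker_embedding" are assumptions based on Flowtron-style models.

```python
import torch
import torch.nn as nn  # only needed for models defined with nn.Module


def load_pretrained_except_speaker_embedding(model, checkpoint_path):
    """Load every pretrained weight except the speaker embedding table,
    leaving that table at its (new, randomly initialized) values.

    Assumes the embedding's parameter keys contain the substring
    'speaker_embedding', as in Flowtron-style models (an assumption)."""
    checkpoint = torch.load(checkpoint_path, map_location="cpu")
    # Checkpoints may wrap the weights under a 'state_dict' key.
    state = checkpoint.get("state_dict", checkpoint)
    # Drop every key that belongs to the speaker embedding table.
    filtered = {k: v for k, v in state.items()
                if "speaker_embedding" not in k}
    # strict=False tolerates the deliberately missing embedding keys.
    missing, unexpected = model.load_state_dict(filtered, strict=False)
    return missing, unexpected
```

With this approach the new, larger embedding table starts from random values while everything else is pretrained, which is exactly the mismatch the comment above describes.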
Me too.
I trained with only one speaker and it converged.
Then I tried to adapt it to 65 speakers, with 15 minutes per speaker.
I followed this.
Eventually I got NaN.
The question is whether I am doing this right. I also tried to fine-tune following the instructions, but it reports a speaker embedding mismatch.
Thank you @rafaelvalle
Have you fixed yet? @v-nhandt21
No, but I found that instead of changing the number of speakers in the pretrained model, we can keep it the same and just replace the embedding of one of the existing speakers.
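The workaround described above (keep n_speakers unchanged and reuse one existing speaker's slot for the new voice) could be sketched as follows. The attribute name `model.speaker_embedding` is an assumption based on Flowtron's layout, and re-initializing the row to match the scale of the existing table is one possible choice, not the repo's prescribed method.

```python
import torch
import torch.nn as nn  # for the nn.Embedding table assumed below


def reset_speaker_slot(model, speaker_id):
    """Re-initialize one row of the speaker embedding table so a new voice
    can be fine-tuned into that slot, keeping n_speakers unchanged.

    Assumes the model exposes the table as `model.speaker_embedding`
    (an nn.Embedding), as in Flowtron (an assumption)."""
    with torch.no_grad():
        weight = model.speaker_embedding.weight
        # Match the scale of the existing rows rather than the default init.
        weight[speaker_id].normal_(0.0, weight.std().item())
```

During fine-tuning, the new speaker's utterances would then be labeled with the reused `speaker_id`, so no checkpoint keys change shape and loading stays strict.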