Hi, I got NaN loss the same as you

Open jiqizaisikao opened this issue 7 years ago • 4 comments

I implemented the code using parts from some other source code. I'm sure the individual parts are right because I checked them independently, but I got the same NaN loss as you: after about 100 training iterations the loss grew bigger and bigger and training ended with an error. Have you found the reason yourself?
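For anyone trying to locate where a loss like this starts to blow up, here is a minimal sketch of a generic diagnostic: measure the global gradient norm each step, fail fast when it is no longer finite, and rescale it when it exceeds a threshold. This is not the fix found in this thread and not code from the repository; the function names and the flat list of gradient values are assumptions made for illustration.

```python
import math


def global_norm(grads):
    """L2 norm over a flat list of gradient values (hypothetical helper)."""
    return math.sqrt(sum(g * g for g in grads))


def clip_by_global_norm(grads, max_norm):
    """Rescale grads so their global norm is at most max_norm.

    Raises FloatingPointError as soon as the norm is NaN or inf, so the
    failing iteration is reported instead of silently producing a NaN loss.
    Returns the (possibly rescaled) gradients and the norm before clipping.
    """
    norm = global_norm(grads)
    if not math.isfinite(norm):
        raise FloatingPointError(f"non-finite gradient norm: {norm}")
    if norm <= max_norm:
        return list(grads), norm
    scale = max_norm / norm
    return [g * scale for g in grads], norm
```

Logging the returned norm per iteration shows whether the gradients grow steadily before the NaN appears, which would point toward the learning rate or a loss term rather than a single bad batch.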

jiqizaisikao avatar Jan 29 '18 02:01 jiqizaisikao

Yes, if you're talking about the VQ-VAE approach, then I also got NaN loss after around 20-100 iterations. I am still not sure what the problem is. I tried removing the VQ part of the VAE to make it very similar to the NSynth approach here https://github.com/tensorflow/magenta/tree/master/magenta/models/nsynth but it still gave NaN loss. My guess is that it has to do with using voice samples instead of short, simple instrument samples. Now that you have reminded me, hopefully I'll have time this week to try again to get it to work. I'm also working on another repo with just the VQ-VAE code: https://github.com/ASzot/vq-vae-audio

Let me know if you're able to make any progress.

ASzot avatar Jan 29 '18 03:01 ASzot

It seems that setting the VQ-VAE commitment_loss coefficient larger and the learning rate smaller works better. I'm training it now, and I still have to change the code to get some results.
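To make the coefficient being discussed concrete, here is a minimal sketch of the two VQ loss terms, assuming the usual VQ-VAE formulation: each encoder output is snapped to its nearest codebook vector, and the commitment term is the squared distance to that vector scaled by beta. The function names, the plain-list data layout, and the averaging over vectors are illustrative assumptions, not the code from either repository.

```python
def nearest_code(vec, codebook):
    """Index of the codebook vector closest to vec (squared L2 distance)."""
    def sq_dist(code):
        return sum((v - c) ** 2 for v, c in zip(vec, code))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def vq_loss_terms(encoder_outputs, codebook, beta):
    """Return (indices, codebook_loss, commitment_loss) for a batch.

    In a real training graph the two losses differ only in where the
    stop-gradient is placed: the codebook loss moves the codes toward the
    encoder outputs, and the commitment loss moves the encoder outputs
    toward the codes. Their forward values before the beta scaling are
    identical, which is what this sketch computes.
    """
    indices = [nearest_code(z, codebook) for z in encoder_outputs]
    total = 0.0
    for z, idx in zip(encoder_outputs, indices):
        total += sum((v - c) ** 2 for v, c in zip(z, codebook[idx]))
    mean_sq = total / len(encoder_outputs)
    return indices, mean_sq, beta * mean_sq
```

With a larger beta, the encoder is penalized more for drifting away from the codebook, which limits how far the encoder outputs can grow. That is one plausible reason the change helps with a diverging loss, though the thread does not confirm the cause.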

jiqizaisikao avatar Jan 29 '18 04:01 jiqizaisikao

So you were able to get the model to train without getting NaN for the loss?

ASzot avatar Jan 29 '18 21:01 ASzot

It works, but the result seems to be wrong. The output sounds like a human voice and the quality is fairly good, but it is not related to the input. Maybe it is as the paper says: "Note that the decoder could completely ignore the deterministic encoding and degenerate to a standard unconditioned WaveNet. However, because the encoding is a strong signal for the supervised output, the model learns to utilize it."

I set beta=5, learning rate=1e-4, batch size=2, and trained for 30k iterations on the CMU ARCTIC dataset, but the VQ loss is much bigger than what I got when I tested on the MNIST dataset. I don't know why. I am now changing the code to retrain it.

jiqizaisikao avatar Jan 30 '18 01:01 jiqizaisikao