Audiovisual-Synthesis More info on training

Open ajaiswal1008 opened this issue 4 years ago • 0 comments

I am training the voice conversion model from scratch using the Obama audio file. I have trained for around 20K steps and the loss is not decreasing much (recon0: 0.03, recon: 0.03, vocoder: 0.04). Also, the audio generated after 20K steps sounds like Obama's voice, but the content information is lost. Can you advise on what steps I should take going forward? Should I just wait until 60K steps, as mentioned in the paper? Also, what loss values would indicate good model performance?

Thanks in advance

Nov 16 '21 18:11 ajaiswal1008
