Rishikesh (ऋषिकेश)
@WhiteFu If you are using this code, use a large (more than 50 hours) expressive dataset like Blizzard to get decent results.
@MisakaMikoto96 Be aware of `NaN` loss; it means your variational autoencoder (VAE) is unable to learn the latent representation. This is a common problem when dealing with variational autoencoders, but the...
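One common mitigation for `NaN` loss when training a VAE is to anneal the weight of the KL-divergence term so it does not dominate early training (alongside gradient clipping in the training loop). A minimal sketch, assuming a linear warm-up; the function name and `warmup_steps` value are my own illustration, not part of this repo:

```python
def kl_weight(step, warmup_steps=10000):
    """Linearly anneal the KL term's weight from 0 to 1 over warmup_steps.

    Keeping the KL weight small at the start of training often prevents the
    latent term from blowing up into NaN before the decoder has learned
    anything useful.
    """
    return min(1.0, step / warmup_steps)

# The weight ramps linearly, then saturates at 1.0.
print(kl_weight(0))       # 0.0
print(kl_weight(5000))    # 0.5
print(kl_weight(20000))   # 1.0
```

The total loss is then `recon_loss + kl_weight(step) * kl_loss` instead of a fixed coefficient.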
Nope, I started training as per the paper; I will change that in the future and compare the results.
Quality is better than HiFi-GAN v1 with less training.
@SolomidHero I will check, but I think the audio should be good. I have trained this model on 4 datasets, including LJSpeech, and it performs well, though not as good as mentioned...
We tested it on multiple datasets and it works better than HiFi-GAN in both speed and quality. Please follow the same pre-processing and hyperparameters mentioned in the repo.
@thepowerfuldeez Fre-GAN is better than UnivNet
I tried it on my own dataset; it takes 150k iterations to generate excellent voice, whereas HiFi-GAN usually takes 1M steps for the same quality.
It only takes 2 days to reach 150k iterations.
@alexdemartos, I am also skeptical about the time-domain loss, but we expect muffled or metallic artifacts before the discriminator kicks in; that is exactly why we use a discriminator, to remove those artifacts.
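For context, a time-domain loss here typically means a distance computed directly on the raw waveform, e.g. an L1 term. A minimal NumPy sketch (the function name and the 16 kHz test signal are my own assumptions for illustration):

```python
import numpy as np

def time_domain_l1(pred, target):
    """Mean absolute error between predicted and target waveforms.

    On its own this loss tends to leave muffled or metallic artifacts;
    the adversarial discriminator is what cleans those up.
    """
    return float(np.mean(np.abs(pred - target)))

# Identical waveforms give zero loss; a constant offset gives roughly
# that offset as the mean absolute error.
t = np.linspace(0.0, 1.0, 16000)          # 1 s of "audio" at 16 kHz
clean = np.sin(2 * np.pi * 440 * t)       # 440 Hz test tone
print(time_domain_l1(clean, clean))       # 0.0
print(time_domain_l1(clean + 0.1, clean)) # approximately 0.1
```

In practice such a waveform term is combined with mel-spectrogram and adversarial losses rather than used alone.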