Seungju

Results: 8 comments by Seungju

I've also tried it with our dataset, and it was able to generalize to unseen speakers. An interesting part was that even though I trained the vocoder with a Korean dataset,...

Most of the samples have quality similar to the samples from 6400 epochs; however, I found that the vocoder was vulnerable to background noise (such as clapping sounds).

Thanks for your reply! I guessed that strange artifacts like the ones below happen because of those hyperparameters. Haven't you seen those artifacts? I got them mainly on the front...

Well, I was training a new model from scratch using a Korean speech corpus. It has 300 hours of various speakers' utterances, and I was getting those artifacts after I...

@seungwonpark Sorry, but I couldn't find a note in the original paper saying that the batch size was carefully chosen. Also, I've been thinking that if we use a multi-speaker training scheme and use...

Is it obvious that MelGAN works best at batch size 16? I recalled the authors' mention, and now it sounds like they realized there are trade-offs between audio fidelity...
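One side of the trade-off can be illustrated numerically: a smaller batch gives a noisier estimate of the true gradient, and the variance of the mini-batch mean shrinks roughly as 1/batch_size. A minimal sketch under a toy model (per-sample gradients as i.i.d. noisy samples around a fixed true gradient; this is not MelGAN's actual training code):

```python
import random

def grad_estimate_variance(batch_size, true_grad=1.0, noise=0.5,
                           trials=2000, seed=0):
    """Empirical variance of a mini-batch gradient estimate.

    Toy model: each per-sample gradient is true_grad plus uniform
    noise, and the mini-batch estimate is their mean.
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        grads = [true_grad + rng.uniform(-noise, noise)
                 for _ in range(batch_size)]
        estimates.append(sum(grads) / batch_size)
    mean = sum(estimates) / trials
    return sum((e - mean) ** 2 for e in estimates) / trials

v16 = grad_estimate_variance(16)
v64 = grad_estimate_variance(64)
print(v16 > v64)  # smaller batch -> noisier gradient estimate
```

Under this model, batch size 16 gives roughly 4x the gradient variance of batch size 64; whether that extra noise helps or hurts audio fidelity in practice is exactly the trade-off in question.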

I also experienced the pronunciation problem. My case was worse, since the pronunciation gets significantly degraded even for long inputs. Have you solved this?

No, I didn't encounter that error. Can you give me more context?