Kumar Saurav
Hello @sidroopdaska, I would love to contribute too. Please reach out to me at [email protected]
Hello @nikich340, I'm trying to train at 8000 Hz with 2 hours of data. I changed the sampling rate in the config file before training, but my audio sounds like it's mumbling, not...
Hello everyone, I'm training the multi-speaker model on a downloaded VCTK dataset (22050 Hz sampling rate). I have trained for 350000 steps, and yet the quality of synthesis is not good...
One update: I noticed that most of the audio files in my dataset have initial silence (before being downsampled), so it remains in the 22050 Hz data as well. I...
@LanglyAdrian Yes, silence does create some issues, but I started getting better results after more epochs.
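For anyone who wants to remove the leading silence instead of training longer, here is a minimal sketch of one way to do it. It is not from the VITS repo: the function name `trim_leading_silence`, the frame length, and the RMS threshold are all assumptions for illustration, and it expects samples already loaded as floats in the range [-1, 1].

```python
import math

def trim_leading_silence(samples, frame_len=256, threshold=0.01):
    """Drop leading frames whose RMS energy is below `threshold`.

    `frame_len` and `threshold` are illustrative defaults (assumptions),
    not values taken from any particular training recipe.
    """
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms >= threshold:
            # Keep everything from the first frame that is loud enough.
            return samples[start:]
    return []  # every frame was below the threshold


# Usage: 512 silent samples followed by 512 samples of a loud square wave.
audio = [0.0] * 512 + [0.5, -0.5] * 256
trimmed = trim_leading_silence(audio)
print(len(audio), len(trimmed))
```

The cut happens on frame boundaries, so up to one frame of silence can remain before the first sound. The threshold would likely need tuning per dataset.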
@LanglyAdrian I'm not sure what your problem could be. Can you share your inference code? Also, are you passing the speaker IDs as per the VCTK data in the filelist?
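A quick way to sanity-check the speaker IDs is sketched below. It assumes each filelist row has the form `path|speaker_id|text` and that IDs must be integers in `0..n_speakers-1`; both are assumptions here, so compare against your own filelists and config. The helper name `check_filelist` is hypothetical.

```python
def check_filelist(lines, n_speakers):
    """Return (line_number, reason) pairs for rows that look malformed.

    Assumes the row layout `path|speaker_id|text` (an assumption, verify
    against your own filelist). Text containing `|` would be misreported.
    """
    problems = []
    for i, line in enumerate(lines, start=1):
        parts = line.rstrip("\n").split("|")
        if len(parts) != 3:
            problems.append((i, "expected 3 fields"))
            continue
        _path, sid, _text = parts
        if not sid.isdigit() or int(sid) >= n_speakers:
            problems.append((i, "speaker id out of range"))
    return problems


# Usage with made-up rows: the second and third rows are reported.
rows = ["a.wav|0|hello", "b.wav|5|hi", "c.wav|text only"]
print(check_filelist(rows, n_speakers=2))
```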
@Linghuxc It doesn't work like that. I trained for around 350k steps with a batch size of 16 and got good quality. You can do the same with batch...
Hello everyone, I have taken care of all the remaining dependencies. When I print the error output, it shows "Cannot Unpack NoneType object". Here is the full log...
You can use [piper](https://github.com/rhasspy/piper) to train and run inference for any number of languages (out of the box). We trained a VITS model in Hindi that sounds similar to English. We also...