Kumar Saurav comments

Results: 9 comments by Kumar Saurav

Improving the latency

Hello @sidroopdaska, I would love to contribute too. Please reach out to me at [email protected]

Questions about 48k audio file train

Hello @nikich340, I'm trying to train at 8000 Hz with 2 hours of data. I changed the sampling rate in the config file before training, but my audio sounds like it's mumbling, not...

How many steps should we train to get the best results?

Hello everyone, I'm training the multi-speaker model on the downloaded VCTK dataset (22050 Hz sampling rate). I have trained for 350000 steps, and yet the synthesis quality is not good...

How many steps should we train to get the best results?

One update: I noticed that most of the audio files in my dataset have initial silence (before being downsampled), so it remains in the 22050 Hz data as well. I...
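For readers hitting the same leading-silence issue, here is a minimal sketch of one way to trim it. It is not the method used in this thread: it assumes mono audio already loaded as a sequence of float samples in the range [-1, 1], and both the function name `trim_leading_silence` and the `threshold` value are hypothetical choices for illustration.

```python
from typing import List, Sequence


def trim_leading_silence(samples: Sequence[float], threshold: float = 0.01) -> List[float]:
    """Drop every sample before the first one whose absolute amplitude exceeds `threshold`.

    `threshold` is an assumed value; tune it for your recordings.
    """
    for i, s in enumerate(samples):
        if abs(s) > threshold:
            return list(samples[i:])
    return []  # the whole clip stays below the threshold


# Toy clip: three near-silent samples followed by signal.
clip = [0.0, 0.001, -0.002, 0.3, -0.25, 0.0, 0.1]
trimmed = trim_leading_silence(clip)
print(len(clip) - len(trimmed), "leading samples removed")
```

Only the start of the clip is trimmed, so quiet samples inside or after the speech are left untouched.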

How many steps should we train to get the best results?

@LanglyAdrian Yes, silence does create some issues, but I started getting better results after more epochs.

How many steps should we train to get the best results?

@LanglyAdrian I'm not sure what your problem could be. Can you share your inference code? Also, are you passing the speaker IDs as per the VCTK data in the filelist?

How many steps should we train to get the best results?

@Linghuxc It doesn't work like that. I trained for around 350k steps with a batch size of 16 and got good quality. You can do the same with batch...

hypo.word file missing during MMS ASR inference

Hello everyone, I have installed all the remaining dependencies, and when I print the error output it shows "Cannot Unpack NoneType object". Here is the full log...

Training for custom dataset

You can use [piper](https://github.com/rhasspy/piper) to train and run inference for any number of languages (out of the box). We trained a VITS model in Hindi that sounds similar to English. We also...