Shivam Mehta

Results 47 comments of Shivam Mehta

A single-speaker model doesn't support speaker indexes. If you switch to any multi-speaker model, the command should work just fine! Example: `tts --model_name...

I have to update the interface. For now, downgrade gradio to `gradio==3.43.2`.
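To check whether an environment already matches that pin, something like the following could work. It is only a sketch: the helper name `gradio_status` is made up for illustration, and it assumes gradio was installed with pip so that its metadata is visible to `importlib.metadata`.

```python
from importlib.metadata import PackageNotFoundError, version

# The version suggested in the comment above.
PINNED = "3.43.2"


def gradio_status(pinned: str = PINNED) -> str:
    """Return a short message saying whether gradio matches the pinned version.

    Hypothetical helper, not part of any library.
    """
    try:
        installed = version("gradio")
    except PackageNotFoundError:
        # gradio is not installed in this environment at all.
        return f"gradio is not installed; run: pip install gradio=={pinned}"
    if installed == pinned:
        return f"gradio {installed} already matches the pin {pinned}"
    # A different version is installed; suggest the exact pin.
    return f"gradio {installed} found; run: pip install gradio=={pinned}"


if __name__ == "__main__":
    print(gradio_status())
```

Running the script prints the install command only when the installed version differs from the pin, so it is safe to run before and after the downgrade.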

I will keep it open for now as a reminder to update the interface after the deadline.

They are! I met the author @cantabile-kwok just last week (super nice guy). It is interesting that we both made certain decisions to improve the speed relative to just conditional flow...

Thank you very much for your kind words :) >I tried to train it, but it seems to train very slowly I see 0.5 to 1.6 iterations per second. At...

Hello! I have added the source code of this and added documentation around it! Hopefully it will help :) https://github.com/shivammehta25/Matcha-TTS/wiki/Improve-GPU-utilisation-by-extracting-phoneme-alignments Kind Regards, Shivam

Hello! The idea seems great; however, I don't think this is an issue on the Matcha-TTS side. It seems to be more of a PyTorch thing. What I read on this thread...

Yeah, the dataloading warning > /opt/miniconda/envs/emotion/lib/python3.10/site-packages/lightning/pytorch/utilities/data.py:77: Trying to infer the `batch_size` from an ambiguous collection. The batch size we found is 15. To avoid any miscalculations, use `self.log(..., batch_size=batch_size)`. is...

Hello, I haven't heard anything about this in the past few weeks. I am closing it for now; feel free to reopen it if the problem persists.

I am sorry, I haven't trained on a Chinese dataset, but I can assure you that model training is language-independent. There are forks for Kyrgyz https://github.com/UlutSoftLLC/MamtilTTS and Catalan https://huggingface.co/projecte-aina/matxa-tts-cat-multiaccent ....