metavoice-src
Please make a simple Gradio app that supports text-to-speech, zero-shot voice cloning, and true training for voice cloning.
It shouldn't be hard for you. It can be ugly and badly coded; it just needs to work.
Have you tried ttsdemo.themetavoice.xyz ?
On Tue, Feb 6, 2024 at 10:28 PM Furkan Gözükara @.***> wrote:
It shouldn't be hard for you. It can be ugly looking and bad coded, just works is sufficient
Looks nice, but I need the source code to run it locally.
By the way, zero-shot is bad, as I expected.
Can you share more details: what text(s) did you try, and which voice did you use (preset or custom upload)?
You can run locally by doing the following:
Hello, where is the Gradio app?
I gave it this 5-minute reference file:
http://sndup.net/p2ct
I got much better results with Coqui voice cloning.
Here is the file it generated from that 5-minute reference file:
https://sndup.net/r99n/
I hate that we still can't attach .wav files to GitHub replies.
Hi @sidroopdaska, thanks for the amazing project.
I tried zero-shot voice cloning with my Indian accent, but I could not get the accent right; the output sounded more foreign.
Can you please tell me more about how to get it right for Indian accents?
Thanks, Rakesh
Hey @INF800, we presently support zero-shot voice cloning for American & British speakers only. For an Indian accent, you will need to finetune. I would recommend 1-5 mins of your voice + LoRA. Let us know if you need any help getting started with this implementation.
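Before finetuning, it can help to confirm that a recording actually falls in the suggested 1-5 minute range. The following is a minimal sketch using only Python's standard `wave` module; the function names and the 60 to 300 second bounds are illustrative assumptions derived from the recommendation above, not part of metavoice-src, and it only handles PCM WAV input.

```python
import wave


def wav_duration_seconds(source):
    """Return the duration in seconds of a PCM WAV file.

    `source` may be a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_suitable_reference(source, min_seconds=60.0, max_seconds=300.0):
    """Check that a clip is within the assumed 1-5 minute range.

    The bounds are hypothetical defaults taken from the thread's
    "1-5 mins of your voice" suggestion, not a documented requirement.
    """
    duration = wav_duration_seconds(source)
    return min_seconds <= duration <= max_seconds
```

Calling `is_suitable_reference("my_voice.wav")` (a hypothetical file name) returns `True` when the clip lasts between one and five minutes.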
@FurkanGozukara
Gradio: https://ttsdemo.themetavoice.xyz/
Reference implementation: https://github.com/metavoiceio/metavoice-src/tree/main/fam/ui
Could you share the result with xTTS so I can compare?
What do you find lacking in the speech with MetaVoice?
Hey @INF800, we presently support zero-shot voice cloning for American & British speakers only. For an Indian accent, you will need to finetune. I would recommend 1-5 mins of your voice + LoRA. Let us know if you need any help getting started with this implementation.
Definitely yes! It would be helpful if you could tell me how to get started.
@sidroopdaska I'd love to train a LoRA as well. Please share any relevant pointers on how to get started.
@sidroopdaska I'd love to train a LoRA as well. Can't wait to integrate it into our projects. How can I get more help? My email: [email protected]
I've added some initial pointers to this here: https://github.com/metavoiceio/metavoice-src/issues/70#issuecomment-1957337895
Hey @platform-kit / @paliacci / @INF800, we just published an initial approach for finetuning the last N transformer blocks of the first stage LLM. Just a note that it'd be best to play around with the hyperparameters in finetune_params.py, as we didn't determine optimal values (some people from the community were keen to contribute this portion). Let us know if you have any issues or if you're up for contributing to improving the finetuning (via parameter sweep or otherwise)!
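To make the idea of "finetuning the last N transformer blocks" and a parameter sweep concrete, here is a rough sketch in plain Python. It does not use the repository's code: the `transformer.h.<index>.` parameter naming, the function names, and the hyperparameter keys are all hypothetical, and the real options live in finetune_params.py, whose contents are not shown in this thread.

```python
from itertools import product


def trainable_param_names(param_names, n_layers, last_n,
                          layer_prefix="transformer.h."):
    """Return the names of parameters in the last `last_n` blocks.

    Assumes (hypothetically) that block parameters are named
    "<layer_prefix><block index>.<rest>". Everything else, such as
    embeddings and output heads, stays frozen in this sketch.
    """
    if not 0 <= last_n <= n_layers:
        raise ValueError("last_n must be between 0 and n_layers")
    first_trainable = n_layers - last_n
    selected = []
    for name in param_names:
        if not name.startswith(layer_prefix):
            continue  # not part of a transformer block
        index_str = name[len(layer_prefix):].split(".", 1)[0]
        if index_str.isdigit() and int(index_str) >= first_trainable:
            selected.append(name)
    return selected


def sweep_grid(**choices):
    """Expand keyword lists into a list of configs (full grid search)."""
    keys = list(choices)
    combos = product(*(choices[key] for key in keys))
    return [dict(zip(keys, values)) for values in combos]
```

In a PyTorch training script, the selected names would typically be used to set `requires_grad` on the matching parameters, and each dictionary from `sweep_grid(learning_rate=[1e-5, 3e-5], last_n=[2, 4])` would drive one finetuning run. Both the keys and the values here are placeholders, not recommended settings.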
The next step to improve finetuning effectiveness is to add LoRA adapters for the first stage LLM, which is being worked on here.
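For readers unfamiliar with LoRA, the sketch below shows the general technique on a single linear layer in plain Python: the original weight stays frozen and only a low-rank update `B @ A` is trained. This is a generic illustration, not the adapter implementation being developed for MetaVoice (which is not shown in this thread); the class name, initialization scale, and defaults are assumptions.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update scale * B @ A."""

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        self.weight = weight  # frozen base weight, shape d_out x d_in
        d_out, d_in = len(weight), len(weight[0])
        rng = random.Random(seed)
        # A starts small and random; B starts at zero, so the adapter
        # initially leaves the base layer's output unchanged.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)]
                  for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def num_trainable(self):
        """Count adapter parameters: rank * (d_in + d_out)."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])
```

The appeal for small datasets, such as a few minutes of one speaker's voice, is that the adapter trains `rank * (d_in + d_out)` values per layer instead of the full `d_in * d_out` weight matrix.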