Regarding SeamlessM4T translation with timbre cloning across spoken languages: here are some examples

Open curui opened this issue 1 year ago • 5 comments

Regarding using only one audio sample: you can speak multiple languages in the voice of that sample. What is actually used for this is Seamless.
You can refer to this: https://replicate.com/adirik/seamless-expressive
It also appears to reference Seamless: https://github.com/replicate/cog-seamlessexpressive
Demo: https://www.youtube.com/watch?v=lgL_rCF02Ng
You can also refer to the recently popular GPT-SoVITS: https://github.com/RVC-Boss/GPT-SoVITS

curui avatar Mar 14 '24 21:03 curui

Is GPT-SoVITS a new replacement for RVC?

rsxdalv avatar Mar 14 '24 21:03 rsxdalv

You can upload an audio clip to test it. It appears to translate the speech directly and clone the voice at the same time: https://replicate.com/adirik/seamless-expressive

curui avatar Mar 14 '24 22:03 curui

He uses Seamless: https://huggingface.co/facebook/seamless-expressive
Screenshot 2024-03-15 060657 I don't know why you do this

curui avatar Mar 14 '24 22:03 curui

Is GPT-SoVITS a new replacement for RVC?

GPT-SoVITS can also do this, but it is too complicated to use. With SeamlessExpressive you only need to upload a piece of audio. However, access to the SeamlessExpressive model has to be reviewed and approved before you can download it. I don't know what the difference is between it and SeamlessM4T.

curui avatar Mar 14 '24 22:03 curui

He uses Seamless: https://huggingface.co/facebook/seamless-expressive Screenshot 2024-03-15 060657 I don't know why you do this

Ah, now I remember seamless-expressive; I had confused it with SeamlessM4T. To be honest, I doubt that many people will be willing to fill out the access request form. GPT-SoVITS and SeamlessM4T can be done, though.

rsxdalv avatar Mar 15 '24 10:03 rsxdalv