On SeamlessM4T translating speech while cloning the speaker's timbre, here are some examples.
With only a single audio sample, you can generate speech in multiple languages in the voice of that sample. What does this is actually Seamless; see https://replicate.com/adirik/seamless-expressive. The author also appears to build on Seamless: https://github.com/replicate/cog-seamlessexpressive. Demo: https://www.youtube.com/watch?v=lgL_rCF02Ng. You can also look at the recently popular GPT-SoVITS: https://github.com/RVC-Boss/GPT-SoVITS
Is GPT-SoVITS a new replacement for RVC?
You can upload an audio clip to test it. It appears to translate the speech directly and clone the voice at the same time: https://replicate.com/adirik/seamless-expressive
It uses Seamless: https://huggingface.co/facebook/seamless-expressive
I don't know why you do this
Is GPT-SoVITS a new replacement for RVC? GPT-SoVITS can also do this, but it is too complicated to use. With SeamlessExpressive, you only need to upload a piece of audio. However, access to the SeamlessExpressive model has to be requested and reviewed before you can download it. I don't know how it differs from SeamlessM4T.
Ah, now I remember seamless-expressive; I had confused it with SeamlessM4T. To be honest, I doubt many people will be willing to fill out the access form. GPT-SoVITS and SeamlessM4T are both doable, though.