tts-generation-webui
Support for SeamlessM4T
This is totally an enhancement, but I wanted to bring the model SeamlessM4T to your attention. The code can be found here. I think this would be a good addition to the other tools in the toolbox.
Thanks, I took a look at it, and I'm a bit confused - are they actually releasing it under CC BY NC 4.0? That's just not open source. I understand if the model weights are CC BY NC 4.0, but if the codebase is that then they are just gaslighting us. I might think and rethink it for a bit, but normally including that code in a project creates massive licensing problems. Hence why it's not open source - just playing with the code can easily be a copyright infringement.
I totally get it. Good catch, I totally didn't notice that before. That sucks, the translation part would have been really cool.
I'll think about what I can do. Also, the project seems to be Linux-only, so it would need to run in a Docker container.
https://github.com/facebookresearch/seamless_communication/issues/38
Also, I can make a Google Colab UI, is there anything in particular you would like to see in the UI?
That would be amazing! Please feel zero pressure to complete this. A Colab UI that can use Tortoise and Seamless would be fantastic. I want to create a free language learning course, and previously I was going to pay Microsoft to use their SSML text-to-speech tool, which has options that let you use the same voice across different languages. Combining these two would make that easier. I'm in no rush, so whenever you get around to it would be fine.
By the way, if you mean free but there are any commercial elements to it (advertisements, sponsorships, donations), you might be breaching the CC BY NC 4.0 license. That's why I honestly recommend avoiding it, although always check the applicable laws, since many good projects are now limited by it. The recent popularity of this license is unprecedented.
Also they moved the issue of changing the license: https://github.com/facebookresearch/seamless_communication/issues/28
Thanks for this update and the recommendation. I doubt they are going to change the license anytime soon. We can forget adding this to the overall application.
There might be some approach to it in the future, but from a project perspective it's quite limited. If they want to pretend this is open source, someone else might create a UI based on this or another project.
@TheWorldEndsWithUs it's no longer licensed under CC BY NC; it's now a mixed license that at least allows me to include it in this project code-wise (although FYI, once you download the model weights, the "combined software" is CC BY NC in terms of distribution).
There's a Seamless M4T Demo available within the Gradio UI. For additional functionality or React UI implementation, please create a feature request.