tts-generation-webui

Support for SeamlessM4T

Open TheWorldEndsWithUs opened this issue 1 year ago • 9 comments

This is totally an enhancement, but I wanted to bring the model SeamlessM4T to your attention. The code can be found here. I think this would be a good addition to the other tools in the toolbox.

TheWorldEndsWithUs avatar Aug 22 '23 19:08 TheWorldEndsWithUs

Thanks, I took a look at it, and I'm a bit confused - are they actually releasing it under CC BY NC 4.0? That's just not open source. I understand if the model weights are CC BY NC 4.0, but if the codebase is that then they are just gaslighting us. I might think and rethink it for a bit, but normally including that code in a project creates massive licensing problems. Hence why it's not open source - just playing with the code can easily be a copyright infringement.

rsxdalv avatar Aug 22 '23 20:08 rsxdalv

I totally get it. Good catch, I totally didn't notice that before. That sucks, the translation part would have been really cool.

TheWorldEndsWithUs avatar Aug 22 '23 22:08 TheWorldEndsWithUs

I'll think about what I can do. Also, the project seems to be Linux-only, so it would need to run in a Docker container.

rsxdalv avatar Aug 23 '23 05:08 rsxdalv

https://github.com/facebookresearch/seamless_communication/issues/38

Also, I can make a Google Colab UI, is there anything in particular you would like to see in the UI?

rsxdalv avatar Aug 24 '23 05:08 rsxdalv

That would be amazing! Please feel zero pressure to complete this. A Colab UI that can use Tortoise and Seamless would be fantastic. I want to create a free language learning course, and previously I was going to pay Microsoft to use their SSML text-to-speech tool, which has options that let you use the same voice across different languages. These two combined would make that easier. I'm in no rush to complete that, so whenever you get around to it would be fine.

TheWorldEndsWithUs avatar Aug 24 '23 13:08 TheWorldEndsWithUs

By the way, if you mean free but there are any commercial elements to it (advertisements, sponsorships, donations), you might be breaching the CC BY NC 4.0 license. That's why I honestly recommend avoiding it, although always check the applicable laws, since many good projects are now limited by it. The recent popularity of this license is unprecedented.

Also, they moved the issue about changing the license: https://github.com/facebookresearch/seamless_communication/issues/28

rsxdalv avatar Sep 21 '23 07:09 rsxdalv

Thanks for this update and the recommendation. I doubt they are going to change the license anytime soon. We can forget adding this to the overall application.

TheWorldEndsWithUs avatar Oct 24 '23 18:10 TheWorldEndsWithUs

There might be some approach to it in the future, but from a project perspective it's quite limited. Like, if they want to pretend this is open source, someone else might create a UI based on this or another project.

rsxdalv avatar Oct 24 '23 18:10 rsxdalv

@TheWorldEndsWithUs it's no longer licensed under CC BY NC; it's now a mixed license that at least allows me to include it in this project code-wise (although FYI, once you download the model weights, the "combined software" is CC BY NC in terms of distribution).

rsxdalv avatar Dec 11 '23 15:12 rsxdalv

There's a Seamless M4T Demo available within the Gradio UI. For additional functionality or React UI implementation, please create a feature request.

rsxdalv avatar Jun 21 '24 21:06 rsxdalv