
Support for long-form synthesis.

Open computersrmyfriends opened this issue 10 months ago • 8 comments

The Gradio app says that "MetaVoice-1B is a 1.2B parameter base model for TTS (text-to-speech). It has been built with the following priorities:

**Support for long-form synthesis.**"

[image]

and on the main page:

[image]

Is there some way to do it now which is probably not documented?

computersrmyfriends avatar Mar 30 '24 16:03 computersrmyfriends

@computersrmyfriends Long-form support is still in the works, see #53

MethanJess avatar Apr 01 '24 12:04 MethanJess

> Is there some way to do it now which is probably not documented?

No, the model is built in a way to handle this, but extra code is needed to make it work. We'll release it along with other upcoming features.

vatsalaggarwal avatar Apr 02 '24 10:04 vatsalaggarwal

For now, is it possible to split text and generate it?

computersrmyfriends avatar Apr 02 '24 11:04 computersrmyfriends

Yes, you can do that manually... but it will mean that the audio sounds inconsistent across sentences...

vatsalaggarwal avatar Apr 02 '24 11:04 vatsalaggarwal
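The manual workaround described above (split the text, then generate each piece separately) could be sketched roughly as follows. This is only an illustration under stated assumptions: `split_into_chunks` and `synthesize_long` are hypothetical helper names, and `synthesize` stands in for whatever function you already use to turn a short string into audio. None of these are part of the metavoice-src API. As noted in the reply, each chunk is generated independently, so the voice may still sound inconsistent from one chunk to the next.

```python
import re
from typing import Callable, List, TypeVar

Audio = TypeVar("Audio")


def split_into_chunks(text: str, max_chars: int = 200) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_chars.

    A single sentence longer than max_chars is kept whole rather than cut
    mid-sentence, so a chunk can exceed the limit in that case.
    """
    # Naive sentence split: break on whitespace that follows '.', '!' or '?'.
    sentences = [
        s.strip()
        for s in re.split(r"(?<=[.!?])\s+", text.strip())
        if s.strip()
    ]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def synthesize_long(
    text: str,
    synthesize: Callable[[str], Audio],  # hypothetical: your own TTS call
    max_chars: int = 200,
) -> List[Audio]:
    """Run the caller-supplied synthesize function once per chunk, in order."""
    return [synthesize(chunk) for chunk in split_into_chunks(text, max_chars)]
```

The per-chunk outputs would then need to be concatenated by whatever audio tooling you already use. The `max_chars` default of 200 is an arbitrary placeholder, not a documented limit of the model.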

Still waiting for long-form support. Please let me know if you have any updates.

computersrmyfriends avatar May 07 '24 18:05 computersrmyfriends

Thanks for the enthusiasm & patience @computersrmyfriends, we'll post an update here once we have more to talk about 👍

lucapericlp avatar May 14 '24 21:05 lucapericlp

I'm excited for this feature too!

fivestones avatar Jun 01 '24 11:06 fivestones

Hi, is there any update on when this feature will be released? cc @vatsalaggarwal

fakerybakery avatar Jul 25 '24 18:07 fakerybakery