Support `text-to-speech` in `pipeline` function and in Optimum
Feature request
SpeechT5 was recently added to Transformers:
- Blog post: https://huggingface.co/blog/speecht5
- Spaces demo: https://huggingface.co/spaces/Matthijs/speecht5-tts-demo
- Models: https://huggingface.co/mechanicalsea/speecht5-tts
It would be great if text-to-speech could be supported across the Transformers stack.
Motivation
@xenova bumped into this as an issue when trying to get SpeechT5 working in the browser (Transformers.js).
Your contribution
Probably unable to help with this at the moment.
cc @sanchit-gandhi
Indeed, a TTS pipeline would be super helpful to run SpeechT5. We're currently planning on waiting till we have 1-2 more TTS models in the library before pushing ahead with a TTS pipeline, in order to verify that the pipeline is generalisable and gives a benefit over loading a single model + processor.
cc @hollance
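For context, the "single model + processor" route mentioned above looks roughly like the sketch below. It is only an illustration, not the proposed pipeline API. It assumes the `microsoft/speecht5_tts` and `microsoft/speecht5_hifigan` checkpoints (neither is named in this thread), and it uses an all-zero speaker embedding purely as a placeholder; a real x-vector would be needed for natural-sounding speech. The `synthesize` helper is a hypothetical name.

```python
def synthesize(text,
               model_id="microsoft/speecht5_tts",        # assumed checkpoint, not from this thread
               vocoder_id="microsoft/speecht5_hifigan"):  # assumed vocoder checkpoint
    """Hypothetical helper: run SpeechT5 TTS by hand, without a pipeline."""
    # Heavy imports are kept inside the function so that merely defining
    # the helper does not require torch/transformers to be loaded.
    import torch
    from transformers import (
        SpeechT5ForTextToSpeech,
        SpeechT5HifiGan,
        SpeechT5Processor,
    )

    # Three separate objects have to be loaded and wired together manually.
    processor = SpeechT5Processor.from_pretrained(model_id)
    model = SpeechT5ForTextToSpeech.from_pretrained(model_id)
    vocoder = SpeechT5HifiGan.from_pretrained(vocoder_id)

    # Tokenize the input text.
    inputs = processor(text=text, return_tensors="pt")

    # Placeholder speaker embedding (assumption): SpeechT5 expects a
    # 512-dimensional x-vector; zeros only keep the sketch self-contained.
    speaker_embeddings = torch.zeros((1, 512))

    # Generate a spectrogram and convert it to a waveform with the vocoder.
    with torch.no_grad():
        speech = model.generate_speech(
            inputs["input_ids"], speaker_embeddings, vocoder=vocoder
        )
    return speech  # 1-D waveform tensor
```

Calling `synthesize("Hello from SpeechT5")` downloads the checkpoints on first use and returns the waveform, which for these checkpoints is sampled at 16 kHz. The amount of model-specific wiring here (processor, vocoder, speaker embedding) is the kind of thing a pipeline would need to hide, which is presumably why it matters that the abstraction generalises across more than one TTS model.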
Any viable contenders for the other 1-2 models? https://paperswithcode.com/task/text-to-speech-synthesis
Hey, I'd be more than happy to take up this task if we can decide on the other 1-2 models
We can probably just select the most popular models from the hub: https://huggingface.co/models?pipeline_tag=text-to-speech&sort=downloads
There is an open PR for FastSpeech2. I think this is a good new model to add. If anyone is interested in taking that PR to completion, that would be awesome!
Let me know if you need any help! I'm excited for this to be added.
Here's another model which could fall into the text-to-speech category: https://github.com/huggingface/transformers/issues/23036
Just added one more https://github.com/huggingface/transformers/issues/23050
Please add support for the mms-tts model, as mentioned in the issue above, to the TTS pipeline.
Good news! This is currently being worked on: https://github.com/huggingface/transformers/pull/24952