epub_to_audiobook

Add support for Coqui TTS

Open Cabeda opened this issue 10 months ago • 4 comments

As the title says, this PR adds a new provider that supports Coqui TTS.

The default model, Tacotron2, works very similarly to EdgeTTS, although it only has a single voice option for now. The strength of this provider is that it can support multiple open TTS models, some of which, like jenny, are very powerful.
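
To illustrate how one provider can cover several models, here is a minimal sketch built on the Coqui `TTS.api.TTS` class. The model identifiers are the ones I believe the Coqui catalogue uses for Tacotron2 and jenny; the `MODEL_ALIASES` table and the `resolve_model` and `synthesize` helpers are hypothetical names for illustration, not the code in this PR.

```python
from typing import Dict

# Model identifiers as I understand them from the Coqui TTS model catalogue.
DEFAULT_MODEL = "tts_models/en/ljspeech/tacotron2-DDC"
JENNY_MODEL = "tts_models/en/jenny/jenny"

# Hypothetical short names, only for this example.
MODEL_ALIASES: Dict[str, str] = {
    "default": DEFAULT_MODEL,
    "jenny": JENNY_MODEL,
}


def resolve_model(name: str) -> str:
    """Return the full model identifier for an alias, or the name unchanged."""
    return MODEL_ALIASES.get(name, name)


def synthesize(text: str, output_path: str, model: str = "default") -> str:
    """Render text to a WAV file with the chosen model (downloads it on first use)."""
    # Imported lazily so this module loads even when the TTS package is absent.
    from TTS.api import TTS

    tts = TTS(resolve_model(model))
    tts.tts_to_file(text=text, file_path=output_path)
    return output_path
```

Calling `synthesize("Hello world", "hello.wav", model="jenny")` would switch models without touching the rest of the pipeline, assuming the TTS package is installed.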

Another interesting feature is voice dubbing with models such as XTTS V2, although for now there's a bug with sentences longer than 400 tokens. To support voice dubbing, I've added a folder with 3 voice samples and defaulted to the male one. This mode also supports multiple languages. Since its language options differ from the ones accepted by --language, I've added a new option named --coqui_language.
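
One possible workaround for the 400-token limit is to split the text before it reaches the model. The sketch below is only an assumption about how that could look: it approximates a "token" as a whitespace-separated word, which is not how the XTTS tokenizer counts, and `split_for_tts` is a hypothetical helper that is not part of this PR.

```python
import re
from typing import List

# Limit mentioned above. A "token" here is approximated as one word,
# which is an assumption and only a rough stand-in for the real tokenizer.
MAX_TOKENS = 400


def split_for_tts(text: str, max_tokens: int = MAX_TOKENS) -> List[str]:
    """Split text into chunks of at most max_tokens words, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current: List[str] = []
    for sentence in sentences:
        words = sentence.split()
        # A single sentence longer than the limit is cut at the word level.
        while len(words) > max_tokens:
            if current:
                chunks.append(" ".join(current))
                current = []
            chunks.append(" ".join(words[:max_tokens]))
            words = words[max_tokens:]
        # Start a new chunk when adding this sentence would exceed the limit.
        if len(current) + len(words) > max_tokens:
            chunks.append(" ".join(current))
            current = []
        current.extend(words)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk could then be synthesized separately and the audio segments concatenated. A safer margin than the full 400 is probably wise, since real token counts tend to be higher than word counts.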

In this version, the provider supports the same audio formats as EdgeTTS, thanks to pydub.

Note: Running Coqui TTS requires downloading the AI model first. The download can range from a few MB to more than 1 GB.

Cabeda, Apr 02 '24 12:04

@p0n1 do you have time to give your thoughts on this PR?

Cabeda, Apr 04 '24 09:04

> @p0n1 do you have time to give your thoughts on this PR?

Hi @Cabeda, thank you for the great work. I just had surgery and am still recovering in the hospital. I will review the code when I feel better.

p0n1, Apr 04 '24 09:04

No probs! Hope for the best 💪🏼

Cabeda, Apr 04 '24 10:04

Have you tried building the Docker image from the Dockerfile with this change? I checked out your repository, but apparently it's missing gcc and the Rust compiler. I think a different base image is needed to install TTS in the Docker image.

kelvin-homann, Apr 12 '24 15:04