[New model] Orpheus-TTS: steerable model, Apache Licensed

Open • phirsch opened this issue 8 months ago • 1 comment

Orpheus-TTS is a relatively new, steerable TTS model with a permissive license (Apache 2.0) and impressive performance. It supports disfluencies (such as sighs and laughter) and voice changes via inline tags.

https://canopylabs.ai/model-releases

As of 2025-04-05 it is #4 on this TTS leaderboard: https://huggingface.co/spaces/Pendrokar/TTS-Spaces-Arena

The model can be run through llama.cpp, and converted GGUF files are available on Hugging Face. I haven't seen a single-process implementation yet, but that should be doable. (RealtimeTTS, for example, uses it through llama.cpp's server.)
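To illustrate the server-based route, here is a rough sketch of what a client could look like. It assumes a llama.cpp server is already running locally with an Orpheus GGUF loaded and posts to its `/completion` endpoint. The URL, the `tara` voice name, the `voice: text` prompt layout, and the `<sigh>` tag are my assumptions for illustration, not something I have verified against the model card. The function names are made up as well.

```python
import json
import urllib.request

# Assumption: llama.cpp's server is listening here with an Orpheus GGUF loaded.
SERVER_URL = "http://127.0.0.1:8080/completion"


def build_request(text, voice="tara", n_predict=1024):
    """Build the JSON body for a llama.cpp /completion request.

    Assumption: the model expects a "<voice>: <text>" prompt and reads
    inline tags such as <sigh> directly from the text.
    """
    prompt = f"{voice}: {text}"
    return {"prompt": prompt, "n_predict": n_predict}


def generate_audio_tokens(text, voice="tara", url=SERVER_URL):
    """Send the prompt to the server and return the raw generated text.

    The result is a string of audio-codec tokens, not a waveform; turning
    it into audio needs a separate decoding step that is not shown here.
    """
    body = json.dumps(build_request(text, voice)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["content"]
```

Calling `generate_audio_tokens("Well <sigh> that took a while.")` would need the server to be up. A single-process implementation would replace the HTTP round trip with in-process inference and keep the prompt building and decoding steps.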

Apr 05 '25 05:04 phirsch

Very interesting. I will definitely look into it.

Apr 06 '25 17:04 mkiol