Kokoro with all supported languages and voices + Orpheus added to API and UI
/voices API added to get list of Kokoro voices and filter them by language for the frontend.
Closes #29 and #30
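For reviewers, here is a rough sketch of the kind of language filtering the /voices endpoint does. The voice names, the idea that a name prefix encodes the language, and the `filter_voices` / `voices_response` helpers are assumptions for illustration only, not the code in this PR.

```python
# Hypothetical voice catalogue. The names and the "prefix = language"
# convention are assumptions for illustration, not the project's data.
VOICES = ["af_heart", "am_adam", "bf_emma", "jf_alpha", "ef_dora"]


def filter_voices(voices, lang_code=None):
    """Return voices whose name starts with lang_code (all voices if None)."""
    if not lang_code:
        return sorted(voices)
    return sorted(v for v in voices if v.startswith(lang_code))


def voices_response(lang_code=None):
    """Build the JSON-style payload a /voices handler might return."""
    return {"voices": filter_voices(VOICES, lang_code)}
```

The frontend could then call the endpoint once per selected language and populate the voice dropdown from the `voices` list.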
This is great!
I was thinking about the same but for all models.
Because Orpheus has several voices as well.
It's a great idea! Adding Orpheus model and voices right now 🚀
Done and ready for review @Blaizzy 🚀
@Blaizzy I tested all the Orpheus voices one by one, and some of them are not working. Tara, Zac and Zoe produce audio that is too long or contains silent stretches, even when running generate from the command line. Give them a try.
Hey Ivan
Yes, you are right! I noticed the same.
I would remove those voices for now. Add some comments and we can revisit them later.
We can try to add back all voices after #68
Closed by mistake, working on it.
No worries, let me know when you're ready :)
Ok @Blaizzy, ready to go. Orpheus was capped at 15 seconds of audio, so I changed the logic to allow the text to be split in multiple ways. Everything looks good to me:
- All voices and languages added for Orpheus
- Longer audio generation in Orpheus
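To illustrate the splitting idea, here is a minimal sketch of one way to break long text into sentence-aligned chunks so that each chunk stays short enough for a single generation pass. The `split_text` name, the regex, and the `max_chars` limit are assumptions of mine, not the exact logic in this PR.

```python
import re


def split_text(text, max_chars=200):
    """Group sentences into chunks of at most max_chars where possible.

    A single sentence longer than max_chars is kept whole as its own chunk.
    """
    # Split after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be generated separately and the resulting audio segments concatenated.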
@Blaizzy ready!
@lucasnewman could you please check the sesame changes and see if anything stands out?
I noticed that the Sesame generate doesn't process a list of prompts the way Kokoro (pipeline) and Orpheus do.
Initially I thought of requiring every model to use a pipeline that handles a list of inputs, but for Orpheus I kept that logic inside generate: since it's an LLM, the pipeline code would only have been a few lines.
Looks fine to me apart from your comments.
Yeah, I personally prefer the simplest approach and lighter abstraction. I think it's reasonable to have every generate() implementation take either a string or list of strings though, since sentence splitting is so common / useful.
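Something along these lines, as a rough sketch of the "string or list of strings" contract. The `as_segments` helper and the `synthesize` callback are hypothetical stand-ins, not functions that exist in the repo.

```python
from typing import Callable, Iterator, List, Union


def as_segments(text: Union[str, List[str]]) -> List[str]:
    """Normalize the input so callers can pass one string or a list of them."""
    if isinstance(text, str):
        return [text]
    return list(text)


def generate(
    text: Union[str, List[str]],
    synthesize: Callable[[str], bytes],  # hypothetical per-segment model call
) -> Iterator[bytes]:
    """Yield one audio result per input segment, in order."""
    for segment in as_segments(text):
        yield synthesize(segment)
```

This keeps the abstraction light: no pipeline class is needed, and sentence splitting can happen upstream before the list is passed in.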