Prince Canuma
Ok, got it! I believe this was fixed in #153, but it will go out once I merge #154 :) We also made some new fixes that will allow it...
Could you share more about how you are using this server? Besides speed, what are your current pain points and what do you need?
There is a PR here from @BenLumenDigital
Yes, we do support it. But I believe you want just SigLip, right?
I added Siglip support here: https://github.com/Blaizzy/mlx-embeddings/releases/tag/v0.0.2 Siglip2 is next.
You will be able to use my siglip2 implementation and add the changes needed to support mexma.
@lucasnewman here is an interesting edge case for Dia. If you take this prompt, it will generate audio with very fast speech. ``` python -m mlx_audio.tts.generate --model mlx-community/Dia-1.6B --text "[S1]...
I noticed that reducing the temperature makes generation slightly slower (+2 sec).
Breaking the text into at most 4 turns seems to address the speed issue: ``` python -m mlx_audio.tts.generate --model mlx-community/Dia-1.6B --text "[S1] Dr. Aris, the AI progress is just breathtaking, isn't...
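The turn-splitting workaround above can be sketched as a small preprocessing step. `split_turns` is a hypothetical helper (not part of mlx-audio): it cuts a Dia script into chunks of at most 4 speaker turns, each of which you would then pass to `mlx_audio.tts.generate` separately.

```python
import re

def split_turns(text: str, max_turns: int = 4) -> list[str]:
    """Split a Dia dialogue script into chunks of at most `max_turns`
    speaker turns (segments starting with tags like [S1], [S2])."""
    # Zero-width lookahead keeps each [Sx] tag attached to its turn.
    turns = [t.strip() for t in re.split(r"(?=\[S\d+\])", text) if t.strip()]
    return [" ".join(turns[i:i + max_turns])
            for i in range(0, len(turns), max_turns)]

script = "[S1] Hello. [S2] Hi there. [S1] How are you? [S2] Great. [S1] Good."
for chunk in split_turns(script):
    print(chunk)
# Two chunks: the first four turns, then the remaining one.
```

Each chunk can then be fed to the CLI (or API) as its own `--text` argument, which keeps every generation call within the 4-turn limit that seems to avoid the fast-speech issue.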
Thanks, Lucas! Indeed, make sure the sample rate is 44100, because the default 24000 sounds bad. > The model still struggles with excessively long pauses in some situations, especially the ellipsis...
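A minimal sketch of why the wrong rate "sounds bad", assuming (as the comment implies) that Dia's native output is 44.1 kHz: writing or playing those samples at 24 kHz stretches playback and lowers pitch by the same ratio.

```python
GENERATED_SR = 44100  # assumed native sample rate of Dia's output
DEFAULT_SR = 24000    # the default that "sounds bad"

def playback_distortion(native_sr: int, playback_sr: int) -> float:
    """Factor by which playback is slowed (and pitch lowered)
    when audio is replayed at the wrong sample rate."""
    return native_sr / playback_sr

factor = playback_distortion(GENERATED_SR, DEFAULT_SR)
print(f"Audio plays {factor:.2f}x slower than intended")
```

So a mismatched rate does not just degrade quality slightly; it makes the speech noticeably slow and deep, which is why passing 44100 explicitly matters.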