FYI: Kokoro TTS iOS port
Just wanted to share that I built a proof-of-concept port of Kokoro for iOS devices. I took the MLX Python code and ported it to MLX Swift, and the model runs successfully on iOS devices (tested on an iPhone 15 and an iPad Air 5th Gen).
The current version is very slow, and there is a lot to improve in both features and performance, but it is a start. Most importantly, it does generate audio, even if slowly. Thanks for your amazing work porting the model from PyTorch to MLX!
Awesome, I will play with it. I have been trying to optimize Kokoro for iOS as well. So far I have mostly played with:
https://github.com/k2-fsa/sherpa-onnx
On my iPhone 13 mini I am only getting a 1.3x real-time speedup factor with it (using either the CPU or the CoreML provider).
I also tried
huggingface.co/onnx-community/kokoro-82m-v1.0-onnx
using onnxruntime-c with simple Swift bindings, and there I am getting around a 2.7x real-time speedup.
I tried exporting the PyTorch model to CoreML but failed. Something tells me that on iOS devices an NPU provider might end up faster than MLX, which executes only on the GPU or CPU (since iPhones don't have many GPU cores).
What kind of speedup are you getting on your iPhone 15 right now?
If you have a chance, please take a look at the update I just pushed to the repo. On a release build I am getting roughly 3.3x speed (audio_length / tts_inference_time) on an iPhone 13 Pro after some heavy optimizations to the decoder. There is still more to be done, but at least it is quite usable now.
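For anyone comparing numbers in this thread, here is a minimal sketch of the metric quoted above (audio_length / tts_inference_time), along with its inverse, which is what "RTF" (real-time factor) conventionally denotes. The function names and the `synthesize` callable are hypothetical, not taken from any of the ports discussed here, and the sample rate is a parameter you would set to your model's actual output rate.

```python
import time


def speedup_factor(audio_seconds: float, inference_seconds: float) -> float:
    """Seconds of audio produced per second of compute (higher is faster)."""
    if inference_seconds <= 0:
        raise ValueError("inference_seconds must be positive")
    return audio_seconds / inference_seconds


def real_time_factor(audio_seconds: float, inference_seconds: float) -> float:
    """Conventional RTF: compute time per second of audio (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return inference_seconds / audio_seconds


def measure_speedup(synthesize, text: str, sample_rate: int) -> float:
    """Time one call to a hypothetical synthesize(text) returning audio samples."""
    start = time.perf_counter()
    samples = synthesize(text)
    elapsed = time.perf_counter() - start
    # Audio duration is the number of samples divided by the sample rate.
    return speedup_factor(len(samples) / sample_rate, elapsed)
```

With this definition, a 3.3x speedup means 3.3 seconds of audio are generated per second of inference. When benchmarking, it probably makes sense to time a release build and discard the first run, since model loading and warm-up can skew a single measurement.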
Hey guys
Thanks for the patience, I have been heads down bug hunting and porting new models!
I saw this issue, it's super exciting and aligned with the vision here.
Could you send a PR and add a folder for the swift port?
That way we have one source of truth and it's easier to collaborate.
I was working towards MLX-Swift, though Swift is not my strong suit.
I believe there is a lot of potential.