Fixes and refactoring of Swift library and demo app
I've fixed and refactored the Swift parts of this repo:
- Fixed Orpheus (now ~2.4 RTF on M3)
- Added OuteTTS (~4.3 RTF on M3)
- Minor fixes for Kokoro and Marvis
- Fixed several crashes
- Fixed MLX usage
- Consolidated iOS and macOS apps into one multi-platform app
- Cleaned up the UI
- Used the latest SwiftUI patterns
- Reduced code duplication
- Switched to loading espeak-ng as a package dependency instead of bundling it with the package
- Switched to loading model and tokenizer files from Hugging Face instead of bundling them with the package
- Separated the library files for later publication as a package
- Migrated to Swift 6.2 using the latest concurrency patterns for thread safety
The final step will be to move the Swift parts into one or more separate repos. I would suggest separate repos for the library (files currently in mlx_audio_swift/tts/MLXAudio as well as Package.swift) and the demo app, so that the demo app can import the Swift package from GitHub.
For LLMs in Swift we currently have a repo called mlx-swift-lm. Following this pattern, we could name the Swift package repo mlx-swift-audio.
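To illustrate the split, a consumer of the library could declare the dependency roughly as follows. This is only a sketch: the GitHub organization in the URL is a placeholder, the product name `MLXAudio` is assumed from the current directory name, and the `DemoApp` target and `main` branch are hypothetical. An Xcode app project would add the same URL through Xcode's package dependency settings instead.

```swift
// swift-tools-version: 6.2
import PackageDescription

let package = Package(
    name: "DemoApp", // hypothetical consumer of the library
    dependencies: [
        // Placeholder URL: the organization and final repo name are not decided yet.
        .package(url: "https://github.com/example/mlx-swift-audio", branch: "main"),
    ],
    targets: [
        .executableTarget(
            name: "DemoApp",
            dependencies: [
                // Assumed product name, taken from the current MLXAudio directory.
                .product(name: "MLXAudio", package: "mlx-swift-audio"),
            ]
        ),
    ]
)
```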
@Blaizzy, we could also download the Kokoro voices from Hugging Face instead of bundling these heavy JSON files if they are uploaded as .safetensors instead of Pickle files.
Well, awesome work! That's a big PR to review
> Loading espeak-ng as a package instead of bundling with the package
I have not gone through the changes yet, but we want to move espeak-ng to https://github.com/Blaizzy/EspeakNG-Swift because of its licensing and to keep it separate from Marvis, so that developers can easily use Marvis without having to use Kokoro or its dependencies.
The espeak-ng organization already has this Swift package, which I'm using in this PR: https://github.com/espeak-ng/espeak-ng-spm
Well done @DePasqualeOrg!
> @Blaizzy, we could also download the Kokoro voices from Hugging Face instead of bundling these heavy JSON files if they are uploaded as .safetensors instead of Pickle files.
I agree, this makes sense, you can implement it 👍🏾
> For LLMs in Swift we currently have a repo called mlx-swift-lm. Following this pattern, we could name the Swift package repo mlx-swift-audio.
Interesting proposal. I don't have a strong preference between mlx-audio-swift and mlx-swift-audio. Either works, but the former seems better for discoverability.
> The espeak-ng organization already has this Swift package, which I'm using in this PR: https://github.com/espeak-ng/espeak-ng-spm
@rudrankriyam what are your thoughts on this?
> Well done @DePasqualeOrg!
> > @Blaizzy, we could also download the Kokoro voices from Hugging Face instead of bundling these heavy JSON files if they are uploaded as .safetensors instead of Pickle files.
>
> I agree, this makes sense, you can implement it 👍🏾
How would you like to handle the Hugging Face repo with the voices? They're currently in Pickle format, which is not technically safe. For Swift we need .safetensors files.
@rudrankriyam could you handle the voices? If you come across any issues let me know.
@Blaizzy, I added the voices to the Hugging Face repo in .safetensors format here: https://huggingface.co/mlx-community/Kokoro-82M-bf16/discussions/1
This will allow them to be downloaded in the Swift app instead of bundling converted files.
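As a rough sketch of the download side: Hugging Face serves individual repo files at `/<repo>/resolve/<revision>/<path>`. The helper name and the voice file path below are hypothetical, and the app may well use a Hub client library instead of building URLs by hand.

```swift
import Foundation

// Hypothetical helper: builds the direct-download URL for one file
// in a Hugging Face model repo at a given revision.
func huggingFaceFileURL(repo: String, path: String, revision: String = "main") -> URL? {
    var components = URLComponents()
    components.scheme = "https"
    components.host = "huggingface.co"
    components.path = "/\(repo)/resolve/\(revision)/\(path)"
    return components.url
}

// The file path is a placeholder; the actual layout of the voices in the repo may differ.
let voiceURL = huggingFaceFileURL(
    repo: "mlx-community/Kokoro-82M-bf16",
    path: "voices/example.safetensors"
)
```

The downloaded file can then be cached on disk so that each voice is fetched only once.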
Here's what the multi-platform app currently looks like on macOS and iOS:
The CI build test is failing because I've used some newer Swift syntax that requires iOS 18.4/macOS 15.4 or newer (specifically, Atomic and isolated deinit). These help a lot with resolving concurrency issues.
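For anyone unfamiliar with these two features, here is a minimal sketch of how they are typically used. The type names are made up for illustration and are not taken from the PR. The availability annotations mirror the OS versions mentioned above.

```swift
import Synchronization

// Hypothetical counter that a background generation task can bump
// while other code reads it, with no lock and no actor hop.
@available(macOS 15.4, iOS 18.4, *)
final class ChunkCounter: Sendable {
    private let count = Atomic<Int>(0)

    func increment() {
        count.wrappingAdd(1, ordering: .relaxed)
    }

    var value: Int {
        count.load(ordering: .relaxed)
    }
}

// Hypothetical main-actor object that owns playback state.
@available(macOS 15.4, iOS 18.4, *)
@MainActor
final class PlaybackController {
    private(set) var isPlaying = false
    private let counter: ChunkCounter

    init(counter: ChunkCounter) {
        self.counter = counter
    }

    func start() {
        isPlaying = true
    }

    func stop() {
        isPlaying = false
    }

    // An isolated deinit runs on the class's actor (here the main actor),
    // so it can call main-actor-isolated methods such as stop().
    isolated deinit {
        stop()
    }
}
```

Without `isolated deinit`, the cleanup call in the deinitializer would have to be restructured, since an ordinary deinit is nonisolated and cannot call main-actor-isolated methods.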
What do the other maintainers of this repo think: Is it acceptable to require at least last year's versions of iOS and macOS to run this library? By now around 95% of users are running compatible OS versions. My preference is to prioritize code ergonomics rather than supporting old OS versions that a diminishing fraction of users will be running.
If you're okay with this, we should update the CI settings accordingly.
Cc @Blaizzy @lucasnewman @rudrankriyam
I'm planning to do some extensive work on Swift MLX audio tooling and apps over the coming months, which I'll continue in my own repo: https://github.com/DePasqualeOrg/mlx-swift-audio
I've preserved the commit history from this repo for the relevant files there. If you're interested in contributing, let me know so that we can coordinate our efforts.