mlx-audio Add ESpeakNG.xcframework for iOS on Kokoro MLX Audio Swift

Added an iOS (real device) slice to ESpeakNG.xcframework, based on the script from https://github.com/mlalma/kokoro-ios

The framework contains multiple 'slices,' including versions for macOS and iOS. While linking is sufficient for macOS to find and use its slice, iOS has stricter requirements.

The core issue was that the iOS slice of ESpeakNG.xcframework was not being correctly embedded into the Swift-TTS app bundle for the iOS target.

To get it to work:

I adjusted the Runpath Search Paths for the iOS target to the standard @executable_path/Frameworks, which tells iOS where to look for embedded frameworks.
Then, I moved the ESpeakNG.xcframework to the main Frameworks directory and set it to "Embed & Sign" in the "Frameworks, Libraries, and Embedded Content" section so that it is embedded in the iOS app.

This path change helps Xcode process the framework for the iOS target; without it, the app runs into a 'dyld: Library not loaded' error.
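
One way to confirm the embedding actually took effect is to check at runtime that the framework sits in the app bundle's Frameworks directory. The following is only a sketch: the helper name `isFrameworkEmbedded` is made up for illustration, and it assumes the embedded bundle is called `ESpeakNG.framework`, which this PR does not state explicitly.

```swift
import Foundation

/// Hypothetical helper: returns true if `<name>.framework` exists as a directory
/// inside `frameworksDirectory` (for an app, normally Bundle.main.privateFrameworksURL).
func isFrameworkEmbedded(named name: String, in frameworksDirectory: URL?) -> Bool {
    guard let directory = frameworksDirectory else { return false }
    let frameworkURL = directory.appendingPathComponent("\(name).framework")
    var isDirectory: ObjCBool = false
    let exists = FileManager.default.fileExists(atPath: frameworkURL.path,
                                                isDirectory: &isDirectory)
    return exists && isDirectory.boolValue
}

// Assumption: the embedded bundle is named "ESpeakNG.framework".
let espeakEmbedded = isFrameworkEmbedded(named: "ESpeakNG",
                                         in: Bundle.main.privateFrameworksURL)
print("ESpeakNG embedded:", espeakEmbedded)
```

If this prints `false` on a device build, the framework was linked but not embedded, which is the situation that leads to the dyld error above.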

What do you think of it? Any better alternative?

May 11 '25 18:05 rudrankriyam

Your overall approach looks fine, but we'd want to add the source and project file, not the binary framework, and build it as a dependency of the MLXAudio framework (or example app, etc) in Swift.

In a perfect world it would reference the official code as a submodule, but it looks like the project structure differs quite a bit so I'm not sure how much effort that would be.

May 11 '25 20:05 lucasnewman

Thank you very much @rudrankriyam!

I agree with you @lucasnewman, but from my initial testing this is the fastest way to unlock iOS.

Since this is only for Kokoro, what do you think about moving forward with this approach in the meantime, while we work on a more robust solution?

May 11 '25 20:05 Blaizzy

I've been playing with it today. I got Kokoro running on my iOS device with multiple voices, but there are limitations in kokoro-ios. The token limit (around 500) and memory issues are the biggest. I've split long texts into chunks and run them through Kokoro TTS one after another. Pronunciation is also an issue.
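
A minimal sketch of the chunking idea described above, not the code actually used here. It packs whole sentences into chunks and hard-splits any sentence that is too long. The function name `chunkText` and the use of character count as a stand-in for Kokoro's token count are assumptions; the real limit is measured in tokens, so `maxLength` would need to be chosen conservatively.

```swift
import Foundation

/// Hypothetical helper: splits `text` into chunks of at most `maxLength` characters,
/// breaking on sentence terminators (. ! ?) where possible.
func chunkText(_ text: String, maxLength: Int) -> [String] {
    precondition(maxLength > 0, "maxLength must be positive")

    // 1. Split into sentences, keeping the terminator with each sentence.
    var sentences: [String] = []
    var current = ""
    for character in text {
        current.append(character)
        if ".!?".contains(character) {
            sentences.append(current.trimmingCharacters(in: .whitespacesAndNewlines))
            current = ""
        }
    }
    let tail = current.trimmingCharacters(in: .whitespacesAndNewlines)
    if !tail.isEmpty { sentences.append(tail) }

    // 2. Pack sentences into chunks without exceeding maxLength.
    var chunks: [String] = []
    var buffer = ""
    for sentence in sentences {
        // Hard-split any single sentence longer than the limit.
        var pieces: [String] = []
        var rest = Substring(sentence)
        while rest.count > maxLength {
            pieces.append(String(rest.prefix(maxLength)))
            rest = rest.dropFirst(maxLength)
        }
        if !rest.isEmpty { pieces.append(String(rest)) }

        for piece in pieces {
            if buffer.isEmpty {
                buffer = piece
            } else if buffer.count + 1 + piece.count <= maxLength {
                buffer += " " + piece
            } else {
                chunks.append(buffer)
                buffer = piece
            }
        }
    }
    if !buffer.isEmpty { chunks.append(buffer) }
    return chunks
}

// Usage: feed the chunks to the TTS engine one at a time (serially).
for chunk in chunkText("One. Two. Three.", maxLength: 10) {
    print(chunk) // replace with the actual synthesis call
}
```

Processing the chunks serially rather than in parallel keeps only one generation in memory at a time, which matters given the memory issue mentioned above.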

May 11 '25 20:05 niklasmato

@niklasmato yeah, as for the pronunciation part, the project mentions that it uses a different phonemizer:

The project uses eSpeak NG as a phonemizer, which is different from what the original Kokoro TTS uses. This can and will cause differences in the output audio.

May 11 '25 20:05 rudrankriyam

Thanks @niklasmato this is great feedback!

Could you please try out this PR and open an issue with some examples if you run into the same limitations?

May 11 '25 20:05 Blaizzy

@lucasnewman what do you think?

May 13 '25 20:05 Blaizzy

@rudrankriyam could you resolve the conflicts?

May 13 '25 20:05 Blaizzy

#138 has added the iOS framework, so closing this.

May 14 '25 08:05 rudrankriyam