Add ESpeakNG.xcframework for iOS on Kokoro MLX Audio Swift
Added an ESpeakNG.xcframework slice for iOS (real device), based on the script from https://github.com/mlalma/kokoro-ios
The framework contains multiple 'slices,' including versions for macOS and iOS. While linking is sufficient for macOS to find and use its slice, iOS has stricter requirements.
The core issue was that the iOS slice of ESpeakNG.xcframework was not being correctly embedded into the Swift-TTS app bundle for the iOS target.
To get it to work:
- I adjusted the Runpath Search Paths for the iOS target to the standard `@executable_path/Frameworks`, which tells iOS where to look for embedded frameworks.
- Then, I moved the `ESpeakNG.xcframework` to the main `Frameworks` directory and set "Embed & Sign" in the "Frameworks, Libraries, and Embedded Content" section so that it is embedded in the iOS app.
This path change helps Xcode process the framework for the iOS target; without it, the app fails with a 'dyld: Library not loaded' error.
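One way to confirm the embedding step took effect is to inspect the built iOS `.app` bundle, since iOS apps keep embedded frameworks directly under `Frameworks/`. The following is a minimal sketch of such a check, meant to be run on the Mac against the build product rather than inside the app. The helper name `hasEmbeddedFramework` is hypothetical, and it assumes the embedded bundle is called `ESpeakNG.framework`.

```swift
import Foundation

/// Hypothetical helper: returns true when `<app>.app/Frameworks/<name>.framework`
/// exists as a directory. iOS app bundles place embedded frameworks directly
/// under `Frameworks/` (macOS apps use `Contents/Frameworks/` instead).
func hasEmbeddedFramework(named name: String, inIOSAppBundle appURL: URL) -> Bool {
    let frameworkURL = appURL
        .appendingPathComponent("Frameworks", isDirectory: true)
        .appendingPathComponent("\(name).framework", isDirectory: true)

    var isDirectory: ObjCBool = false
    let exists = FileManager.default.fileExists(atPath: frameworkURL.path,
                                                isDirectory: &isDirectory)
    // A framework is a directory bundle, so a plain file does not count.
    return exists && isDirectory.boolValue
}
```

Calling `hasEmbeddedFramework(named: "ESpeakNG", inIOSAppBundle: appURL)` with `appURL` pointing at the built app should return `true` once "Embed & Sign" is set. A `false` result would be consistent with the 'dyld: Library not loaded' failure described above.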
What do you think of it? Any better alternative?
Your overall approach looks fine, but we'd want to add the source and project file, not the binary framework, and build it as a dependency of the MLXAudio framework (or example app, etc) in Swift.
In a perfect world it would reference the official code as a submodule, but it looks like the project structure differs quite a bit so I'm not sure how much effort that would be.
Thank you very much @rudrankriyam!
I agree with you @lucasnewman, but from my initial testing this is the fastest way to unlock iOS.
Since this is only for Kokoro, what do you think about moving forward with this approach in the meantime while we work on a more robust solution?
I've been playing with it today. I got Kokoro running on my iOS device with multiple voices, but there are limitations in kokoro-ios. The token limit (around 500) and memory issues are the biggest. I've split long texts into chunks and serialized the TTS calls to Kokoro. Pronunciation is also an issue.
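For anyone hitting the same limit, the chunk-and-serialize workaround could look roughly like the sketch below. It is not the code used above. It packs sentences into chunks under a character budget, using character count as a stand-in for the token limit (an assumption), and then feeds the chunks to a caller-supplied `synthesize` closure one at a time. Both `chunkText` and `synthesizeSequentially` are hypothetical names.

```swift
import Foundation

/// Splits `text` into chunks of at most `maxLength` characters,
/// preferring to break after '.', '!' or '?'.
func chunkText(_ text: String, maxLength: Int) -> [String] {
    precondition(maxLength > 0, "maxLength must be positive")

    // 1. Cut the text into sentences at terminal punctuation.
    var sentences: [String] = []
    var current = ""
    for ch in text {
        current.append(ch)
        if ch == "." || ch == "!" || ch == "?" {
            sentences.append(current)
            current = ""
        }
    }
    if !current.isEmpty { sentences.append(current) }

    // 2. Greedily pack sentences into chunks that respect the budget.
    var chunks: [String] = []
    var chunk = ""
    for sentence in sentences {
        var piece = sentence.trimmingCharacters(in: .whitespacesAndNewlines)
        if piece.isEmpty { continue }
        // A single over-long sentence is hard-split (possibly mid-word).
        while piece.count > maxLength {
            if !chunk.isEmpty { chunks.append(chunk); chunk = "" }
            let cut = piece.index(piece.startIndex, offsetBy: maxLength)
            chunks.append(String(piece[..<cut]))
            piece = String(piece[cut...])
        }
        if piece.isEmpty { continue }
        if chunk.isEmpty {
            chunk = piece
        } else if chunk.count + 1 + piece.count <= maxLength {
            chunk += " " + piece
        } else {
            chunks.append(chunk)
            chunk = piece
        }
    }
    if !chunk.isEmpty { chunks.append(chunk) }
    return chunks
}

/// Runs `synthesize` on each chunk in order, one at a time.
/// `synthesize` stands in for whatever TTS call the app uses.
func synthesizeSequentially(_ text: String, maxLength: Int,
                            synthesize: (String) throws -> Void) rethrows {
    for chunk in chunkText(text, maxLength: maxLength) {
        try synthesize(chunk)
    }
}
```

Processing chunks strictly in order keeps only one generation in flight, which may also help with the memory pressure mentioned above. A real implementation would budget by phoneme or token count rather than characters.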
@niklasmato yeah, for the pronunciation part, kokoro-ios mentions using a different phonemizer:
The project uses eSpeak NG as a phonemizer, which is different from what the original Kokoro TTS uses. This can and will cause differences in the output audio.
Thanks @niklasmato this is great feedback!
Could you please try out this PR and open an issue with some examples if you find the same limitations?
@lucasnewman what do you think?
@rudrankriyam could you resolve the conflicts?
#138 has added the iOS framework so closing this