
linking error when updating onnxruntime to 1.20.2

pzoltowski opened this issue 8 months ago · 3 comments

Overall, I'm trying to speed up Kokoro TTS inference using the CoreML provider on iOS, and I know onnxruntime has improved the CoreMLExecutionProvider a lot. I tried to update sherpa-onnx to use the latest 1.20.2 build by just updating the version in ./build-ios.sh. Previously, compiling and testing with 1.17.3 worked on my iPhone (13 mini, latest iOS). Attaching the linking error (I think it's related to absl?).

I'm on macOS 15.3.1 and Xcode 16.2.

sherpa-compilation-log.txt

pzoltowski · Mar 09 '25 15:03

I suspect there is an error in our onnxruntime 1.20.2 xcframework.

Could you try https://github.com/CocoaPods/Specs/blob/1f7ce5ed8e6460f3192ad4f9791c4ae8023c2139/Specs/3/a/a/onnxruntime-c/1.21.0/onnxruntime-c.podspec.json#L13

csukuangfj · Mar 09 '25 16:03

Just tried it: if I only swap onnxruntime.xcframework without recompiling sherpa-onnx.xcframework, then uninstall the iPhone app and do a clean rebuild of the SherpaOnnxTts sample, it still works on my device. Maybe a slight performance improvement on CoreML, but sadly I expected a bigger gain.

Rebuilding sherpa-onnx.xcframework didn't work, though. The folder structure has changed: previously there was a static library, and now there is a .framework:

```
-- location_onnxruntime_header_dir: /Users/patryk/Downloads/sherpa-onnx-master/build-ios/ios-onnxruntime/onnxruntime.xcframework/Headers
CMake Error at cmake/onnxruntime.cmake:163 (message):
  /Users/patryk/Downloads/sherpa-onnx-master/build-ios/ios-onnxruntime/onnxruntime.xcframework/ios-arm64/libonnxruntime.a cannot be found
Call Stack (most recent call first):
  CMakeLists.txt:344 (include)
```

sherpa-with-onnxruntime-1.21.0-log.txt

I will still try to see if it's possible to get some performance gain, because even on CPU, sherpa-onnx on my MacBook M2 Max (tested via Python) gets only a 2.5x real-time speedup, compared to the kokoro-onnx package, which gets 6x on the CPU provider (and 4.7x on the CoreML provider; since the iPhone has a weaker CPU but the same NPU as this MacBook, the gain should be similar, yet sadly on the iPhone 13 mini I'm getting only ~1.1x).
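To be clear, by "Nx speedup" I mean audio duration divided by synthesis wall-clock time, roughly measured like this (`synthesize` is a hypothetical placeholder for the sherpa-onnx / kokoro-onnx call, not a real API):

```python
# Minimal sketch of how the "Nx realtime" numbers are computed; synthesize()
# is a placeholder standing in for the actual TTS call.
import time

start = time.perf_counter()
samples, sample_rate = synthesize(text)  # float samples + sample rate
elapsed = time.perf_counter() - start

# Speedup = seconds of audio produced per second of wall-clock time.
print(f"{len(samples) / sample_rate / elapsed:.1f}x realtime")
```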

pzoltowski · Mar 09 '25 16:03

FWIW, I did a bit more experimenting to see if I can get some performance gain. First I tried a different ONNX model, from https://huggingface.co/onnx-community/Kokoro-82M-v1.0-ONNX. I did some simple scaffolding to run already-generated tokens through it via C++, using onnxruntime 1.20.2 in release mode on my iPhone 13 mini (iOS 18.2).

I'm getting around a 2.8x speedup factor relative to real time (audio duration) with the full model (fp32) and the CPU provider.

Then I tried to run that model using the sherpa-onnx TTS sample. I had to massage the model a little bit: adding sherpa model metadata, renaming the input from input_ids to tokens and the output from waveform to audio, and changing the output tensor shape from float32[1,num_samples] to float32[audio_length]. I also swapped onnxruntime.xcframework to 1.20.2 in this case as well (I didn't recompile sherpa-onnx, though, because of the build script error I shared before). This also ran on CPU with 1-2 threads, but I'm getting only a 1.6x speedup factor.
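Concretely, the massaging was along these lines (a sketch using the onnx Python package; the exact sherpa-onnx metadata keys are omitted here, the full script is attached at the end of this comment):

```python
# Sketch of the model massaging described above, using the onnx Python
# package. The sherpa-onnx metadata key/value below is illustrative only.
import onnx
from onnx import helper, TensorProto

model = onnx.load("kokoro-v1.0.onnx")

# Rename input_ids -> tokens and waveform -> audio everywhere in the graph.
for node in model.graph.node:
    node.input[:] = ["tokens" if n == "input_ids" else n for n in node.input]
    node.output[:] = ["audio" if n == "waveform" else n for n in node.output]
for i in model.graph.input:
    if i.name == "input_ids":
        i.name = "tokens"
for o in model.graph.output:
    if o.name == "waveform":
        o.name = "audio"
        # float32[1, num_samples] -> float32[audio_length]
        o.type.CopyFrom(
            helper.make_tensor_type_proto(TensorProto.FLOAT, ["audio_length"]))

# Add sherpa-onnx model metadata (illustrative key/value).
entry = model.metadata_props.add()
entry.key, entry.value = "model_type", "kokoro"

onnx.save(model, "model-sherpa.onnx")
```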

Do you have any idea why it's significantly slower when running such a model via the sherpa-onnx framework? I tested only with a short sentence:

```python
tokens = [50, 157, 43, 135, 16, 53, 135, 46, 16, 43, 102, 16, 56, 156, 57, 135, 6, 16, 102, 62, 61, 16, 70, 56, 16, 138, 56, 156, 72, 56, 61, 85, 123, 83, 44, 83, 54, 16, 53, 65, 156, 86, 61, 62, 131, 83, 56, 4, 16, 54, 156, 43, 102, 53, 16, 156, 72, 61, 53, 102, 112, 16, 70, 56, 16, 138, 56, 44, 156, 76, 158, 123, 56, 16, 62, 131, 156, 43, 102, 54, 46, 16, 102, 48, 16, 81, 47, 102, 54, 16, 54, 156, 51, 158, 46, 16, 70, 16, 92, 156, 135, 46, 16, 54, 156, 43, 102, 48, 4, 16, 81, 47, 102, 16, 50, 156, 72, 64, 83, 56, 62, 16, 156, 51, 158, 64, 83, 56, 16, 44, 157, 102, 56, 16, 44, 156, 76, 158, 123, 56, 4]
```
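For reference, the C++ scaffolding was roughly equivalent to this Python sketch (the input names input_ids/style/speed and the af.bin layout are assumptions based on the onnx-community export, not confirmed here; verify with sess.get_inputs()):

```python
# Hedged sketch: run the pre-generated tokens through the Kokoro ONNX model
# with onnxruntime's Python API. Input names and the voice-embedding layout
# are assumptions from the onnx-community export.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("kokoro-v1.0.onnx",
                            providers=["CPUExecutionProvider"])

# Assumed: af.bin holds float32 voice-embedding rows of dim 256,
# indexed by token count (as in kokoro-onnx).
voices = np.fromfile("af.bin", dtype=np.float32).reshape(-1, 256)
style = voices[len(tokens)][None, :]

audio = sess.run(None, {
    "input_ids": np.array([[0, *tokens, 0]], dtype=np.int64),  # 0-padded
    "style": style,
    "speed": np.array([1.0], dtype=np.float32),
})[0]
```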

I noticed voices.bin in getTtsFor_kokoro_en_v0_19 is heavier (6 MB) vs af.bin (0.5 MB) in the HF repo. Could that be the reason? Or are there other settings in sherpa-onnx I might not be aware of that would require some fine-tuning?
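One way to sanity-check the size difference (assuming, as in kokoro-onnx, float32 voice embeddings of dimension 256):

```python
# Hedged diagnostic: count how many 256-dim float32 embeddings each file
# holds; the dtype and dimension are assumptions, not confirmed here.
import numpy as np

for path in ["voices.bin", "af.bin"]:
    data = np.fromfile(path, dtype=np.float32)
    print(path, data.size // 256, "embeddings of dim 256")
```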

I measure the performance of only this call:

```cpp
// Time just the inference call, excluding tokenization and audio handling.
sess_->Run({}, input_names_ptr_.data(), inputs.data(), inputs.size(),
           output_names_ptr_.data(), output_names_ptr_.size());
```

just to be sure I measure only inference.

FWIW, also attaching the model conversion script (copying metadata and modifying the input/output):

convert_metadata_kokoro.py.txt

pzoltowski · Mar 11 '25 16:03