sherpa-onnx
Error when running tts model
Update
The problem was an invalid tokens file. Maybe we can improve the error message?
I tried to run a TTS model on macOS (M1) with examples/tts.rs and got this error:
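For anyone hitting the same `Duplicated token` error shown below, here is a minimal sketch of a standalone check that can be run on a tokens file before loading the model. It is not sherpa-onnx's own code. It assumes each line has the form `<token> <id>` separated by whitespace, and that a line holding only an id stands for the space token; the function name `find_duplicate_tokens` is made up for illustration.

```rust
use std::collections::HashMap;

/// Scans the contents of a tokens file and returns one entry per duplicated
/// token: (1-based line number, token, id of the first occurrence).
///
/// Assumed format (not taken from the sherpa-onnx docs): `<token> <id>` per
/// line, where a line containing only an id represents the space token.
fn find_duplicate_tokens(contents: &str) -> Vec<(usize, String, i64)> {
    let mut seen: HashMap<String, i64> = HashMap::new();
    let mut duplicates = Vec::new();

    for (idx, line) in contents.lines().enumerate() {
        let mut parts = line.split_whitespace();

        // Skip blank lines.
        let first = match parts.next() {
            Some(p) => p,
            None => continue,
        };

        // Two fields: token and id. One field: the id of the space token.
        let (token, id) = match parts.next() {
            Some(id_str) => (first.to_string(), id_str.parse::<i64>().ok()),
            None => (" ".to_string(), first.parse::<i64>().ok()),
        };

        // Skip lines whose id is not a number; this sketch only looks for duplicates.
        let Some(id) = id else { continue };

        if let Some(&existing) = seen.get(&token) {
            duplicates.push((idx + 1, token, existing));
        } else {
            seen.insert(token, id);
        }
    }

    duplicates
}
```

Calling it with `std::fs::read_to_string("tokens.txt")` and printing the returned tuples gives the line number of every repeated token together with the id it was first assigned, which is roughly the information a clearer error message could carry.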
➜ sherpa-rs git:(main) ✗ cargo run --example tts --features="tts" -- --text 'שלום, מה שלומך היום?' --output audio.wav --tokens 'tokens.txt' --model 'model_sherpa.onnx' --provider cpu
Compiling sherpa-rs v0.1.5-beta.5 (/Users/user/Documents/sherpa-rs)
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.37s
Running `target/debug/examples/tts --text 'שלום, מה שלומך היום?' --output audio.wav --tokens tokens.txt --model model_sherpa.onnx --provider cpu`
/Users/user/Documents/sherpa-rs/target/debug/build/sherpa-rs-sys-2249181d8eb0cc1d/out/sherpa-onnx/sherpa-onnx/csrc/offline-tts-character-frontend.cc:ReadTokens:68 Duplicated token '. Line ' 176. Existing ID: 174
➜ sherpa-rs git:(main) ✗ cd sys/sherpa-onnx
➜ sherpa-onnx git:(master) git rev-parse HEAD
c0eaf86dbd4b7c842852215d5418e065a64e6190
In addition, when using the coreml provider I got many other warnings:
log
cargo run --example tts --features="tts" -- --text 'שלום, מה שלומך היום?' --output audio.wav --tokens 'tokens.txt' --model 'model_sherpa.onnx' --provider coreml
Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.05s
Running `target/debug/examples/tts --text 'שלום, מה שלומך היום?' --output audio.wav --tokens tokens.txt --model model_sherpa.onnx --provider coreml`
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.957684 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958148 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958171 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958185 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958206 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958222 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958237 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958305 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958348 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958371 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958389 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958403 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958424 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958440 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958455 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958468 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958491 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958506 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958521 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958533 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958552 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958567 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958582 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958593 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.959 tts[6889:483163] 2024-07-11 20:25:17.959690 [W:onnxruntime:, coreml_execution_provider.cc:104 GetCapability] CoreMLExecutionProvider::GetCapability, number of partitions supported by CoreML: 43 number of nodes in the graph: 2885 number of nodes supported by CoreML: 80
2024-07-11 20:25:18.470 tts[6889:483163] 2024-07-11 20:25:18.470639 [W:onnxruntime:, session_state.cc:1166 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2024-07-11 20:25:18.470 tts[6889:483163] 2024-07-11 20:25:18.470693 [W:onnxruntime:, session_state.cc:1168 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
/Users/user/Documents/sherpa-rs/target/debug/build/sherpa-rs-sys-2249181d8eb0cc1d/out/sherpa-onnx/sherpa-onnx/csrc/offline-tts-character-frontend.cc:ReadTokens:68 Duplicated token '. Line ' 176. Existing ID: 174
Another model worked:
cargo run --example tts --features="tts" -- --text 'liliana, the most beautiful and lovely assistant of our team!' --output audio.wav --tokens 'tokens.txt' --model 'vits-ljs.onnx' --lexicon lexicon.txt
Note that the failed model is for Hebrew. It worked for me on Windows a few days ago.