Error when running tts model

Open thewh1teagle opened this issue 1 year ago • 2 comments

Update

The problem was the tokens file: it was invalid. Maybe we can improve the error message?
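For anyone hitting the same thing, here is a minimal sketch of a pre-flight check that looks for duplicated tokens before the file is handed to the model. It is not part of sherpa-rs or sherpa-onnx; `DuplicateToken` and `find_duplicate_tokens` are hypothetical names, and it assumes each line of the tokens file has the form `<token> <id>`, which may not hold for every model.

```rust
use std::collections::HashMap;

/// One duplicated entry: the 1-based line number, the token itself,
/// and the ID the token was given the first time it appeared.
#[derive(Debug, PartialEq)]
pub struct DuplicateToken {
    pub line: usize,
    pub token: String,
    pub existing_id: String,
}

/// Scans the contents of a tokens file and reports every repeated token.
///
/// Assumption (not confirmed by the sherpa-onnx sources): each line is
/// `<token> <id>`. The line is split on the LAST space so that a token
/// which is itself a single space still parses.
pub fn find_duplicate_tokens(contents: &str) -> Vec<DuplicateToken> {
    // Maps each token to the ID from its first occurrence.
    let mut seen: HashMap<&str, &str> = HashMap::new();
    let mut duplicates = Vec::new();

    for (index, line) in contents.lines().enumerate() {
        // Skip blank lines and lines without a space separator.
        let Some((token, id)) = line.rsplit_once(' ') else {
            continue;
        };

        if let Some(existing_id) = seen.get(token).copied() {
            duplicates.push(DuplicateToken {
                line: index + 1,
                token: token.to_string(),
                existing_id: existing_id.to_string(),
            });
        } else {
            seen.insert(token, id);
        }
    }

    duplicates
}
```

Reading the file with `std::fs::read_to_string("tokens.txt")` and passing the result to `find_duplicate_tokens` gives a list that can be printed with the line number, the token, and the ID it already had, which is roughly the information the current message appears to be reporting.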


I tried to run a TTS model on macOS (M1) with examples/tts.rs and got this error:

➜  sherpa-rs git:(main) ✗ cargo run --example tts --features="tts" -- --text 'שלום, מה שלומך היום?' --output audio.wav --tokens 'tokens.txt' --model 'model_sherpa.onnx' --provider cpu
   Compiling sherpa-rs v0.1.5-beta.5 (/Users/user/Documents/sherpa-rs)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.37s
     Running `target/debug/examples/tts --text 'שלום, מה שלומך היום?' --output audio.wav --tokens tokens.txt --model model_sherpa.onnx --provider cpu`
/Users/user/Documents/sherpa-rs/target/debug/build/sherpa-rs-sys-2249181d8eb0cc1d/out/sherpa-onnx/sherpa-onnx/csrc/offline-tts-character-frontend.cc:ReadTokens:68 Duplicated token '. Line ' 176. Existing ID: 174
➜  sherpa-rs git:(main) ✗ cd sys/sherpa-onnx                                                                                                    
➜  sherpa-onnx git:(master) git rev-parse HEAD

c0eaf86dbd4b7c842852215d5418e065a64e6190

In addition, when using the coreml provider I got many other warnings:

log
cargo run --example tts --features="tts" -- --text 'שלום, מה שלומך היום?' --output audio.wav --tokens 'tokens.txt' --model 'model_sherpa.onnx' --provider coreml
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.05s
     Running `target/debug/examples/tts --text 'שלום, מה שלומך היום?' --output audio.wav --tokens tokens.txt --model model_sherpa.onnx --provider coreml`
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.957684 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958148 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958171 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958185 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958206 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958222 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958237 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958305 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958348 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958371 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958389 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958403 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958424 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958440 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958455 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958468 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958491 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958506 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958521 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958533 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958552 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958567 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958582 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_output_0, shape: {0}
2024-07-11 20:25:17.958 tts[6889:483163] 2024-07-11 20:25:17.958593 [W:onnxruntime:, helper.cc:93 IsInputSupported] CoreML does not support shapes with dimension values of 0. Input:/model/text_encoder/encoder/layers.0/attention/ConstantOfShape_2_output_0, shape: {0}
2024-07-11 20:25:17.959 tts[6889:483163] 2024-07-11 20:25:17.959690 [W:onnxruntime:, coreml_execution_provider.cc:104 GetCapability] CoreMLExecutionProvider::GetCapability, number of partitions supported by CoreML: 43 number of nodes in the graph: 2885 number of nodes supported by CoreML: 80
2024-07-11 20:25:18.470 tts[6889:483163] 2024-07-11 20:25:18.470639 [W:onnxruntime:, session_state.cc:1166 VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2024-07-11 20:25:18.470 tts[6889:483163] 2024-07-11 20:25:18.470693 [W:onnxruntime:, session_state.cc:1168 VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.
/Users/user/Documents/sherpa-rs/target/debug/build/sherpa-rs-sys-2249181d8eb0cc1d/out/sherpa-onnx/sherpa-onnx/csrc/offline-tts-character-frontend.cc:ReadTokens:68 Duplicated token '. Line ' 176. Existing ID: 174

Another model worked:

cargo run --example tts --features="tts" -- --text 'liliana, the most beautiful and lovely assistant of our team!' --output audio.wav --tokens 'tokens.txt' --model 'vits-ljs.onnx' --lexicon lexicon.txt

Note that the failing model is for Hebrew. It worked for me on Windows a few days ago.

thewh1teagle · Jul 11 '24 17:07