sherpa-onnx
Piper Phonemizer Wasm
I am running the onnx model directly with onnx-web-runtime, and that part works. However, I am finding it difficult to compile piper-phonemizer for wasm. Could you share the build script for it, or a compiled wasm version that generates phoneme IDs from text?
I am using forks of piper-phonemize and espeak-ng. Please see
- https://github.com/k2-fsa/sherpa-onnx/blob/master/cmake/espeak-ng-for-piper.cmake
- https://github.com/k2-fsa/sherpa-onnx/blob/master/cmake/piper-phonemize.cmake
Please find the script for building the wasm version at
- https://github.com/k2-fsa/sherpa-onnx/blob/master/build-wasm-simd-tts.sh
By the way, you can use sherpa-onnx for wasm with piper models directly.
Please see our doc at https://k2-fsa.github.io/sherpa/onnx/tts/wasm/index.html
We have already done everything needed to use piper with wasm, so you don't need to write your own. It supports all models from piper.
Thanks Fangjun. Will check those build scripts again.
I started with the sherpa-onnx builds, and it was surprisingly easy to swap models and compile new bundles with different models. But my application requires a very low generation time, so I was batching the token IDs and generating the audio in parallel with a windowed approach, since I only care about playback and not about exporting the audio.
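For reference, the windowed batching described above could look roughly like the sketch below. This is only an illustration: `windowPhonemeIds` and `synthesizeWindows` are hypothetical names, and the `synthesize` callback stands in for whatever function actually runs the onnx model on one window of IDs. Nothing here is part of sherpa-onnx or piper.

```typescript
// Hypothetical helper: split a flat phoneme-ID sequence into fixed-size windows.
// The last window may be shorter than windowSize.
function windowPhonemeIds(ids: number[], windowSize: number): number[][] {
  if (!Number.isInteger(windowSize) || windowSize <= 0) {
    throw new RangeError("windowSize must be a positive integer");
  }
  const windows: number[][] = [];
  for (let start = 0; start < ids.length; start += windowSize) {
    windows.push(ids.slice(start, start + windowSize));
  }
  return windows;
}

// Hypothetical helper: run a caller-supplied synthesis function on every
// window concurrently. Promise.all preserves input order, so the audio
// pieces come back in playback order.
async function synthesizeWindows<T>(
  windows: number[][],
  synthesize: (ids: number[]) => Promise<T>,
): Promise<T[]> {
  return Promise.all(windows.map((w) => synthesize(w)));
}
```

Playback can start as soon as the first window resolves if the promises are awaited one by one instead of through `Promise.all`.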
With the phonemizer bundled inside sherpa-onnx, I would need to batch by words to attempt something like this, but I suspect the pauses between chunks will sound more artificial with a word-based approach.
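One way to soften that, sketched below under my own assumptions, is to cut chunks at punctuation where possible so that chunk boundaries line up with natural pauses, and to fall back to a word limit only when a clause runs long. `chunkWords` is a hypothetical helper, not a sherpa-onnx API, and the punctuation set is just a guess for English-like text.

```typescript
// Hypothetical helper: group words into chunks for separate synthesis.
// A chunk ends at a word that finishes with clause punctuation, or once
// it reaches maxWords, whichever comes first.
function chunkWords(text: string, maxWords: number): string[] {
  if (!Number.isInteger(maxWords) || maxWords <= 0) {
    throw new RangeError("maxWords must be a positive integer");
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  let current: string[] = [];
  for (const word of words) {
    current.push(word);
    // Assumed punctuation set; adjust for the target language.
    const endsClause = /[.,;:!?]$/.test(word);
    if (endsClause || current.length >= maxWords) {
      chunks.push(current.join(" "));
      current = [];
    }
  }
  // Flush any trailing words that did not hit a boundary.
  if (current.length > 0) {
    chunks.push(current.join(" "));
  }
  return chunks;
}
```

Each chunk would then be passed to the TTS engine as its own text input. Whether the joins sound natural still depends on the model, so this would need listening tests.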