sherpa-onnx

Piper Phonemizer Wasm

Open iAmIlluminati opened this issue 1 year ago • 2 comments

I am trying to run the ONNX model directly with onnx-web-runtime, and I was able to. However, I am finding it difficult to compile piper-phonemizer for wasm. Can you share the script for it, or a compiled Wasm version that generates phonemeId from text?

iAmIlluminati avatar Apr 02 '24 21:04 iAmIlluminati

I am using a fork of piper-phonemizer and espeak-ng. Please see

  • https://github.com/k2-fsa/sherpa-onnx/blob/master/cmake/espeak-ng-for-piper.cmake
  • https://github.com/k2-fsa/sherpa-onnx/blob/master/cmake/piper-phonemize.cmake

Please find the scripts for building wasm at

  • https://github.com/k2-fsa/sherpa-onnx/blob/master/build-wasm-simd-tts.sh

By the way, you can use sherpa-onnx for wasm with piper models directly.

Please see our doc at https://k2-fsa.github.io/sherpa/onnx/tts/wasm/index.html

We have already done everything needed to use piper with wasm, so you don't need to write your own. It supports all models from piper.

csukuangfj avatar Apr 03 '24 03:04 csukuangfj

Thanks Fangjun. Will check those build scripts again.

I started building sherpa-onnx, and it was surprisingly easy to swap the models and compile new bundles with different models. However, my application requires a very low generation time, so I was batching the tokenIds and generating the audio in parallel with a windowed approach, since I am concerned with playing the audio rather than exporting it.

With the phonemizer bundled in sherpa-onnx, I would need to batch by words to attempt something like this, but I guess the pauses between batches would sound more artificial with a word-based approach.

iAmIlluminati avatar Apr 03 '24 04:04 iAmIlluminati
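
For readers following the thread, the windowed batching idea from the last comment might look roughly like the TypeScript sketch below. It is only an illustration under stated assumptions: `splitIntoWindows`, `synthesizeWindowed`, the `Synthesize` callback, and the `play` callback are hypothetical names, not part of sherpa-onnx, piper, or any ONNX runtime API. The caller is assumed to supply a function that turns one window of phoneme IDs into audio samples.

```typescript
// Hypothetical callback: turns one window of phoneme IDs into audio samples.
// The real model call (for example via an ONNX runtime) is supplied by the caller.
type Synthesize = (ids: number[]) => Promise<Float32Array>;

// Split a flat list of phoneme IDs into consecutive windows of at most `size` IDs.
function splitIntoWindows(ids: number[], size: number): number[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const windows: number[][] = [];
  for (let i = 0; i < ids.length; i += size) {
    windows.push(ids.slice(i, i + size));
  }
  return windows;
}

// Start synthesis for every window up front, then pass the resulting
// chunks to `play` strictly in their original order.
async function synthesizeWindowed(
  ids: number[],
  size: number,
  synthesize: Synthesize,
  play: (chunk: Float32Array) => void,
): Promise<void> {
  const pending = splitIntoWindows(ids, size).map((w) => synthesize(w));
  for (const p of pending) {
    play(await p);
  }
}
```

Playback of the first chunk can begin as soon as the first window finishes, without waiting for the rest. Whether the windows are actually computed concurrently depends on how the supplied `synthesize` function runs the model, and cutting windows at arbitrary positions may produce audible seams, which is the same concern raised above for word-based batching.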