sherpa-onnx: TTS WebAssembly does not work for other languages

I followed the instructions for building text-to-speech with WebAssembly (https://k2-fsa.github.io/sherpa/onnx/tts/wasm/index.html). With the English model used in the instructions, it worked well.

But when I tried a model for a different language, for example https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-mls-medium.tar.bz2

it did not generate correct audio. (A single-speaker model seems to work, but multi-speaker models do not work well.)

How can I solve this problem?

May 08 '24 23:05 kmpartner

Please tell us what you have done with the German tts model.

Could you describe in detail what "not work well" means?

May 09 '24 01:05 csukuangfj

Thank you for the reply. I just followed the documentation page (https://k2-fsa.github.io/sherpa/onnx/tts/wasm/build.html), changing only the URL passed to wget.

The page was displayed successfully, but when I tried to generate German speech from the text "Heute ist ein guter Tag. Gestern war ein guter Tag.", it generated strange voices for every speaker ID I tested (5 to 6 different IDs).

When I used a single-speaker model (I do not remember which one), the generated voice had no problem.

May 11 '24 12:05 kmpartner

by changing URL for wget

Could you describe in detail what you have done?

May 11 '24 14:05 csukuangfj

I tried both wget and downloading manually from the models list: wget -q https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-de_DE-mls-medium.tar.bz2

Extract the downloaded archive, copy the .onnx file, tokens, and espeak-ng-data to the asset folder, and rename the .onnx file to model.onnx.

Delete the old contents of the build-wasm-simd-tts folder.

Run build-wasm-simd-tts.sh.

Test the page.

The voice generated from "Heute ist ein guter Tag. Gestern war ein guter Tag." is very long (about 20 seconds) and sounds strange.
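
The staging step above could be scripted roughly as follows. This is only a sketch: the function name `stage_tts_model`, the `tokens.txt` file name, and the placeholder paths in the usage comment are assumptions and are not taken from the sherpa-onnx build scripts.

```shell
#!/bin/sh
# Sketch: copy the files of an extracted model directory into the asset
# directory used by the WebAssembly TTS build.
# Assumptions (not confirmed in this thread): the tokens file is named
# tokens.txt and the extracted directory holds exactly one .onnx file.
stage_tts_model() {
  src_dir=$1     # extracted model directory
  asset_dir=$2   # asset directory of the wasm TTS build

  # Expand the glob into the function's positional parameters and
  # insist on exactly one match, so the wrong model is never staged.
  set -- "$src_dir"/*.onnx
  if [ "$#" -ne 1 ] || [ ! -f "$1" ]; then
    echo "expected exactly one .onnx file in $src_dir" >&2
    return 1
  fi
  model_file=$1

  if [ ! -f "$src_dir/tokens.txt" ] || [ ! -d "$src_dir/espeak-ng-data" ]; then
    echo "tokens.txt or espeak-ng-data is missing in $src_dir" >&2
    return 1
  fi

  mkdir -p "$asset_dir"
  # Remove files left over from a previous model so that files from two
  # different models cannot end up mixed in the same build.
  rm -rf "$asset_dir/model.onnx" "$asset_dir/tokens.txt" \
    "$asset_dir/espeak-ng-data"

  cp "$model_file" "$asset_dir/model.onnx"
  cp "$src_dir/tokens.txt" "$asset_dir/tokens.txt"
  cp -R "$src_dir/espeak-ng-data" "$asset_dir/espeak-ng-data"
}

# Hypothetical usage (both paths are placeholders):
#   stage_tts_model ./path/to/extracted-model ./path/to/assets
```

Clearing the old asset files first rules out one possible cause of strange output, namely a model paired with the tokens or espeak-ng-data of another model.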

May 12 '24 00:05 kmpartner

Could you switch to another German model?

I just tested it and found that the model cannot produce correct speech. I am deleting it.

May 12 '24 03:05 csukuangfj

By the way, you can try all German tts models at https://huggingface.co/spaces/k2-fsa/text-to-speech

May 12 '24 04:05 csukuangfj

That is no problem. I am testing it. But I want to know why the multi-speaker model works for English but not for other languages (I tested a French multi-speaker model as well, and it also generates strange voices). Which files are responsible for the strange voices?

May 12 '24 04:05 kmpartner

I tested French multi-speakers model as well, and it generates strange voices

Please tell us the exact model you are using.

Please first test the model at https://huggingface.co/spaces/k2-fsa/text-to-speech
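
As a local complement to the Hugging Face space, the same check could in principle be run natively before building for WebAssembly, so that a bad model can be told apart from a bad wasm build. The sketch below only prints the commands so they can be reviewed first. It assumes a native build of `sherpa-onnx-offline-tts` that accepts the `--vits-model`, `--vits-tokens`, `--vits-data-dir`, `--sid`, and `--output-filename` options; the function name `print_tts_checks` and the file layout are hypothetical.

```shell
#!/bin/sh
# Sketch: print one sherpa-onnx-offline-tts command per speaker ID.
# Assumption: model_dir contains model.onnx, tokens.txt and
# espeak-ng-data, i.e. the same layout as the wasm asset folder.
print_tts_checks() {
  model_dir=$1   # directory holding the staged model files
  text=$2        # sentence to synthesize
  shift 2        # the remaining arguments are speaker IDs

  for sid in "$@"; do
    # One output file per speaker ID, so the results can be compared.
    printf 'sherpa-onnx-offline-tts --vits-model=%s/model.onnx --vits-tokens=%s/tokens.txt --vits-data-dir=%s/espeak-ng-data --sid=%s --output-filename=sid-%s.wav "%s"\n' \
      "$model_dir" "$model_dir" "$model_dir" "$sid" "$sid" "$text"
  done
}

# Hypothetical usage (the directory is a placeholder):
#   print_tts_checks ./path/to/assets "Heute ist ein guter Tag." 0 1 2
```

If the natively generated audio is also wrong for every speaker ID, the model itself is the likely culprit rather than the WebAssembly build.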

May 12 '24 06:05 csukuangfj

I don't remember exactly, but I think the model was https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-piper-fr_FR-mls-medium.tar.bz2

It is possible that vits-piper models whose names contain "mls(-medium)" do not work well in other languages either.

May 13 '24 00:05 kmpartner

I suggest that you don't use any model including mls in its name. I am deleting this model from sherpa-onnx.

May 13 '24 01:05 csukuangfj
