Piper issue running binary and Python on Windows

Hi, thanks for the latest Windows release.

I downloaded the release and ran it in cmd. -h prints the help menu, but nothing happens when I run the command below. I also tried downloading the model and pointing to it, but cmd still gives no response.

echo 'Welcome to the world of speech synthesis!' | .\piper.exe --model en_US-lessac-medium --output_file welcome.wav

I also tried running it with Python, installing it in a conda virtual environment, and ran into this issue:

ERROR: Cannot install piper-tts==1.1.0 and piper-tts==1.2.0 because these package versions have conflicting dependencies.

The conflict is caused by:
    piper-tts 1.2.0 depends on piper-phonemize~=1.1.0
    piper-tts 1.1.0 depends on piper-phonemize~=1.0.0
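For readers unfamiliar with the ~= operator in the message above: it is the "compatible release" specifier, so piper-phonemize~=1.1.0 accepts 1.1.x releases at or above 1.1.0 but not 1.2.0. The sketch below is a simplified, stdlib-only illustration of that rule for plain numeric versions. The function name is hypothetical and this is not pip's actual implementation.

```python
def is_compatible_release(spec: str, candidate: str) -> bool:
    """Simplified check for 'candidate ~= spec' on purely numeric versions.

    Assumption: versions look like '1.1.0' (no pre-releases, epochs, etc.).
    """
    spec_parts = [int(p) for p in spec.split(".")]
    cand_parts = [int(p) for p in candidate.split(".")]
    if len(spec_parts) < 2:
        raise ValueError("~= needs at least two version components")
    # Pad a shorter candidate with zeros so the comparison is well defined.
    cand_parts += [0] * (len(spec_parts) - len(cand_parts))
    # Every component except the last one given in the spec must match...
    prefix_len = len(spec_parts) - 1
    same_prefix = cand_parts[:prefix_len] == spec_parts[:prefix_len]
    # ...and the candidate must not be older than the spec.
    return same_prefix and cand_parts >= spec_parts


if __name__ == "__main__":
    for candidate in ("1.0.0", "1.1.0", "1.1.3", "1.2.0"):
        print(candidate, is_compatible_release("1.1.0", candidate))
```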

Any help will be greatly appreciated. Thanks!

Nov 15 '23 10:11 jessicaliew1024

In "--model en_US-lessac-medium", the value should be a path to an ONNX model.
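A minimal Python sketch of that advice, assuming piper.exe and the downloaded en_US-lessac-medium.onnx file sit in the current directory (both locations are assumptions). It resolves the model name to a full .onnx path, fails early if the file is missing, and sends the text on stdin so that cmd quoting is not involved (cmd's echo passes single quotes through as literal characters). The helper names are hypothetical; only the --model and --output_file flags come from the command above.

```python
import subprocess
from pathlib import Path


def resolve_model(model: str) -> Path:
    """Return the absolute path to the .onnx file, or raise if it is missing."""
    path = Path(model)
    if path.suffix != ".onnx":
        # "en_US-lessac-medium" -> "en_US-lessac-medium.onnx"
        path = path.with_name(path.name + ".onnx")
    path = path.resolve()
    if not path.is_file():
        raise FileNotFoundError(f"ONNX model not found: {path}")
    return path


def build_command(exe: str, model: str, output_file: str) -> list[str]:
    """Build the argument list for piper using the flags shown above."""
    return [exe, "--model", str(resolve_model(model)), "--output_file", output_file]


def synthesize(text: str, exe: str, model: str, output_file: str) -> None:
    """Run piper, passing the text on stdin instead of through echo."""
    cmd = build_command(exe, model, output_file)
    subprocess.run(cmd, input=text.encode("utf-8"), check=True)
```

Calling synthesize("Welcome to the world of speech synthesis!", r".\piper.exe", "en_US-lessac-medium", "welcome.wav") would then either write welcome.wav or raise a clear error instead of failing silently.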

Nov 15 '23 16:11 synesthesiam

There is no piper-phonemize wheel for Windows: https://pypi.org/project/piper-phonemize/#files
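To make the point concrete: pip only installs a wheel whose platform tag matches the running system, and the platform tag is the last dash-separated field of the wheel filename. The sketch below parses filenames and reports whether any of them targets Windows. The function names are hypothetical, and the filename in the usage note is illustrative, not a verified listing of the PyPI page.

```python
WINDOWS_PLATFORM_PREFIXES = ("win32", "win_amd64", "win_arm64")


def wheel_platform_tag(filename: str) -> str:
    """Return the platform tag of a wheel filename.

    Wheel names follow name-version[-build]-python-abi-platform.whl,
    so the platform tag is always the last dash-separated field.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    return filename[: -len(".whl")].split("-")[-1]


def has_windows_wheel(filenames: list[str]) -> bool:
    """True if at least one wheel in the list targets Windows."""
    return any(
        wheel_platform_tag(name).startswith(WINDOWS_PLATFORM_PREFIXES)
        for name in filenames
    )
```

For example, has_windows_wheel(["piper_phonemize-1.1.0-cp310-cp310-manylinux_2_28_x86_64.whl"]) returns False, which is the situation pip is in on Windows.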

Nov 16 '23 09:11 Ajaja

@synesthesiam The non-Python version works great on Windows. Thank you very much! Are there plans to add a --cuda option? I tried using the onnxruntime-win-x64-gpu library from https://github.com/microsoft/onnxruntime/releases with the CUDA libraries (cublas64_11.dll, cublasLt64_11.dll) from developer.nvidia.com, but it looks like the CUDA execution provider for onnxruntime has to be enabled in the piper code.

Nov 16 '23 17:11 Ajaja

I was able to build a piper-phonemize wheel for Windows and use piper with Python and --cuda:

E:\test>echo 'Welcome to the world of speech synthesis!'   | E:\Python310\python.exe -m piper  --cuda --model en_US-lessac-medium --debug --output_file welcome.wav
DEBUG:__main__:Namespace(model='en_US-lessac-medium', config=None, output_file='welcome.wav', output_dir=None, output_raw=False, speaker=None, length_scale=None, noise_scale=None, noise_w=None, cuda=True, sentence_silence=0.0, data_dir=['E:\\test'], download_dir=None, update_voices=False, debug=True)
DEBUG:piper.download:Loading E:\Python310\lib\site-packages\piper\voices.json
DEBUG:piper.download:Checking E:\test\en_US-lessac-medium.onnx
DEBUG:piper.download:Checking E:\test\en_US-lessac-medium.onnx.json
WARNING:piper.download:Wrong size (expected=7010, actual=4885) for E:\test\en_US-lessac-medium.onnx.json
DEBUG:piper.download:Downloading https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/lessac/medium/en_US-lessac-medium.onnx.json to E:\test\en_US-lessac-medium.onnx.json
INFO:piper.download:Downloaded E:\test\en_US-lessac-medium.onnx.json (https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/lessac/medium/en_US-lessac-medium.onnx.json)
2023-11-17 12:49:13.1325434 [W:onnxruntime:, session_state.cc:1162 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2023-11-17 12:49:13.1401462 [W:onnxruntime:, session_state.cc:1164 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.

Strangely, it runs much slower on the GPU (an RTX 2060) than on my CPU (i5-11400). Is it because of the warning "Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf."?
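One way to investigate is to check which execution providers onnxruntime actually offers and which ones a session ends up using. The sketch below assumes the onnxruntime-gpu package is installed and that the model file is at the path passed in; the helper names are hypothetical. It does not explain the slowdown by itself, it only confirms whether CUDA is in use.

```python
def choose_providers(available: list[str], want_cuda: bool) -> list[str]:
    """Order providers so CUDA is tried first, with CPU as the fallback."""
    providers = []
    if want_cuda and "CUDAExecutionProvider" in available:
        providers.append("CUDAExecutionProvider")
    providers.append("CPUExecutionProvider")
    return providers


def report_session_providers(model_path: str, want_cuda: bool = True) -> list[str]:
    """Create a session and return the providers it actually uses."""
    import onnxruntime as ort  # imported lazily; requires onnxruntime(-gpu)

    providers = choose_providers(ort.get_available_providers(), want_cuda)
    session = ort.InferenceSession(model_path, providers=providers)
    return session.get_providers()
```

If report_session_providers("en_US-lessac-medium.onnx") returns only CPUExecutionProvider, the CUDA provider was not loaded. If CUDA is listed, timing several runs after the first one can help separate one-time start-up cost from steady-state speed.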

Nov 17 '23 10:11 Ajaja

Same behaviour for me. I have an NVIDIA Tesla P40, and Piper is much slower on it than on the Xeon CPU. While transcoding is in progress, I can see the Python process consuming almost 10G of VRAM.

Running CUDA 11.8 in the container.

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.23.08              Driver Version: 545.23.08    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  Tesla P40                      Off | 00000000:07:00.0 Off |                    0 |
| N/A   42C    P0              50W / 250W |   6706MiB / 23040MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                                         
+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      4024      C   /home/ovos/.venv/bin/python3               6142MiB |
|    0   N/A  N/A   1195867      C   /home/ovos/.venv/bin/python3                562MiB |
+---------------------------------------------------------------------------------------+

Dec 09 '23 16:12 goldyfruit

same error

Dec 10 '23 09:12 lonngxiang

@Ajaja How did you build piper-phonemize on Windows? I am getting the following error:

'''
"C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.37.32822\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -DVERSION_INFO=1.2.0 -IC:\Users\everton.aleixo\AppData\Local\Temp\pip-req-build-2ovc09ro\espeak-ng\build\include -IC:\Users\everton.aleixo\AppData\Local\Temp\pip-req-build-2ovc09ro\lib\Linux-AMD64\onnxruntime\include -IC:\Users\everton.aleixo\AppData\Local\Temp\pip-build-env-tcyf_rom\overlay\Lib\site-packages\pybind11\include -IC:\Users\everton.aleixo\AppData\Local\miniconda3\envs\piper-tts\include -IC:\Users\everton.aleixo\AppData\Local\miniconda3\envs\piper-tts\Include "-IC:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.37.32822\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Auxiliary\VS\include" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\um" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\shared" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\winrt" "-IC:\Program Files (x86)\Windows Kits\10\include\10.0.22621.0\cppwinrt" "-IC:\Program Files (x86)\Windows Kits\NETFXSDK\4.8\include\um" "-IC:\Program Files (x86)\Windows Kits\10\Include\10.0.22000.0\ucrt" "-IC:\Program Files (x86)\Windows Kits\10\Include\10.0.22000.0\shared" /EHsc /Tpsrc/phoneme_ids.cpp /Fobuild\temp.win-amd64-cpython-310\Release\src/phoneme_ids.obj /std:c++latest /EHsc /bigobj
phoneme_ids.cpp
C:\Users\everton.aleixo\AppData\Local\Temp\pip-req-build-2ovc09ro\src\phoneme_ids.hpp(76): error C2015: too many characters in constant
C:\Users\everton.aleixo\AppData\Local\Temp\pip-req-build-2ovc09ro\src\phoneme_ids.hpp(77): error C2015: too many characters in constant
'''

Jan 08 '24 18:01 evertonaleixo

@evertonaleixo Just convert phoneme_ids.hpp to UTF-8 with BOM.
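A small sketch of that conversion in Python, assuming the header is currently valid UTF-8 without a BOM and that its path is supplied by the caller (the function name is hypothetical). The background is that MSVC generally reads a source file without a BOM using the system code page unless told otherwise, so multi-byte characters inside character literals can trigger C2015.

```python
import codecs
from pathlib import Path


def add_utf8_bom(path: str) -> None:
    """Prefix a UTF-8 file with a byte order mark, leaving the rest untouched."""
    file = Path(path)
    data = file.read_bytes()
    if data.startswith(codecs.BOM_UTF8):
        return  # already has a BOM, nothing to do
    # Fail loudly if the file is not UTF-8 instead of mislabeling it.
    data.decode("utf-8")
    file.write_bytes(codecs.BOM_UTF8 + data)
```

Working on raw bytes keeps the original line endings and makes the function safe to run more than once.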

Jan 08 '24 18:01 Ajaja

@Ajaja On Windows, I was able to build and install the piper_phonemize and piper-tts Python packages. However, I get this error when running the piper command.

The venv/lib/site-packages/piper_phonemize directory only contains __pycache__, the espeak-ng-data folder, __init__.py and libtashkeel_model.ort. I am not sure where the piper_phonemize_cpp module is supposed to be for __init__.py to import from.
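A quick way to check whether the compiled part of the package was installed at all is to look for a file whose name starts with piper_phonemize_cpp and ends with one of the interpreter's extension-module suffixes (.pyd on Windows, .so on Linux). The sketch below does that for a given directory. The function name is hypothetical, and where the build is supposed to place the module is not something this thread confirms.

```python
import importlib.machinery
from pathlib import Path


def find_extension_modules(directory: str, module_name: str) -> list[Path]:
    """List compiled extension files for module_name inside directory."""
    suffixes = tuple(importlib.machinery.EXTENSION_SUFFIXES)
    return sorted(
        entry
        for entry in Path(directory).iterdir()
        if entry.is_file()
        and entry.name.startswith(module_name + ".")
        and entry.name.endswith(suffixes)
    )
```

It is worth running this on both the piper_phonemize folder and its parent site-packages folder, since a top-level extension module would live next to the package rather than inside it. An empty result in both places suggests the extension was never built or never copied into the wheel.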

Jan 22 '24 01:01 tn-17

@tkyin17 I've already deleted everything and can't check, but if I'm not mistaken, it needs the espeak-ng-data folder in C:\usr\share\
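If that recollection is right, the data folder bundled with the Python package could be copied into place with a few lines of Python. This is only a sketch: the source folder is the piper_phonemize directory described earlier in the thread, the C:\usr\share destination is the unverified location mentioned above, and the function name is hypothetical.

```python
import shutil
from pathlib import Path


def copy_espeak_data(package_dir: str, share_dir: str) -> Path:
    """Copy <package_dir>/espeak-ng-data to <share_dir>/espeak-ng-data."""
    source = Path(package_dir) / "espeak-ng-data"
    if not source.is_dir():
        raise FileNotFoundError(f"no espeak-ng-data folder in {package_dir}")
    target = Path(share_dir) / "espeak-ng-data"
    # dirs_exist_ok lets the copy be repeated without failing (Python 3.8+).
    shutil.copytree(source, target, dirs_exist_ok=True)
    return target
```

A call such as copy_espeak_data(r"venv\Lib\site-packages\piper_phonemize", r"C:\usr\share") would then create C:\usr\share\espeak-ng-data if it does not exist yet.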

Jan 22 '24 13:01 Ajaja

@tn-17 Did you find a solution?

May 20 '24 10:05 Haurrus

I did not. I even tried building on Linux but ran into the same piper_phonemize_cpp import issue :(

I moved on to using https://github.com/k2-fsa/sherpa-onnx, which implements its own TTS inference that uses piper-phonemize.

May 20 '24 10:05 tn-17

@Ajaja @evertonaleixo @goldyfruit How did you manage to make it work?

May 20 '24 13:05 Haurrus