Thank you for this amazing project. Do you need help?
Thanks for working on this project -- I was looking for something I'd be able to compile once and run forever. It's awesome. I'm using the C++ version of it on a MacBook Pro, a Framework 13, and on servers, with en_US-hfc_male-medium.onnx, pretty much every day since I found this repo ~1.5mo ago. It improved my workflow, and the voice quality is very good for what I'm using it for.
Do you guys need any help here?
I definitely do need help, thanks 😄
I've been reworking the code base here: https://github.com/OHF-Voice/piper1-gpl. I don't have a C++ version there just yet; I wanted to focus on the Python version first, followed by a proper libpiper with a C API. I'm thinking of reworking the Python code to use the stable ABI so I can have a single wheel per platform instead of needing one for every combination of platform and Python version.
Something I want to do is get the GitHub Actions set up to publish automatically to PyPI, etc. when a new version is tagged.
What do you think?
The vision for this project could be to become Ollama for Voice. I couldn't find anything else, standalone, stable, other than piper and espeak-ng, and piper voices were so much better. espeak-ng didn't stand the test of time. In other words: the C++ runner is what I came for.
What distinguishes piper's Python version from other TTSes that came out recently?
Three things distinguish Piper's Python version (in piper1-gpl):
- The only dependency is onnxruntime, so a fresh install is less than 300MB as opposed to multiple gigabytes of dependencies from every other Python-based TTS system I've seen.
- Piper is phoneme-based, so fine-tuning models across languages becomes feasible. Almost every voice is fine-tuned from one of the good English ones (lessac). This also means new words/pronunciations can be added (not super easy with espeak-ng, but possible).
- A large number of languages are supported. Most TTS systems have English and/or Chinese models, or are focused on one other language. Piper currently supports about 40 languages/dialects.
> The vision for this project could be to become Ollama for Voice.
I think it could be as long as the models are in onnx format and support a specific interface. Piper 1 models take in a list of phoneme ids and inference settings (length/noise scale), and produce only audio. If more elaborate models are to be supported, we would need to know in the config file and adjust accordingly.
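To make the "specific interface" idea concrete, here is a minimal Python sketch of how a runner might pick a model adapter based on the voice config. Everything named here is an assumption for illustration: the `interface` config key, the `SynthesisRequest` type, the default scale values, and the adapter registry are hypothetical and not part of Piper. The adapter returns silence in place of a real onnxruntime call, so the sketch only shows the dispatch, not actual inference.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SynthesisRequest:
    """Inputs a Piper 1 style model takes: phoneme ids plus inference settings."""
    phoneme_ids: List[int]
    length_scale: float = 1.0   # assumed default, for illustration only
    noise_scale: float = 0.667  # assumed default, for illustration only


# An adapter turns a request into raw audio bytes (hypothetical type).
Adapter = Callable[[SynthesisRequest], bytes]


def piper1_adapter(request: SynthesisRequest) -> bytes:
    """Stand-in for a real ONNX call: returns 16-bit silence.

    A real adapter would feed the phoneme ids and scales to the model.
    """
    samples_per_phoneme = int(256 * request.length_scale)
    return b"\x00\x00" * (samples_per_phoneme * len(request.phoneme_ids))


# Registry of known interfaces; more elaborate models would add entries here.
ADAPTERS: Dict[str, Adapter] = {"piper1": piper1_adapter}


def select_adapter(config: dict) -> Adapter:
    """Choose an adapter from a hypothetical 'interface' key in the config."""
    interface = config.get("interface", "piper1")
    try:
        return ADAPTERS[interface]
    except KeyError:
        raise ValueError(f"unsupported model interface: {interface!r}") from None
```

With this shape, a config that names an unknown interface fails early with a clear error instead of producing garbage audio, and supporting a new model family means registering one more adapter.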
@synesthesiam this is great.
Do you care about the C++ implementation? If I helped review C++ patches, would you merge them?
I don't see any real benefit in writing the project in C++.
Almost any computer, including Raspberry Pi, can run Python these days.
There isn't much benefit in terms of speed, really.
It will only add more and more complexity to the project.
See this package: piper-onnx. I've written Piper inference code in about 100 lines, and it includes a built-in espeak-ng that works on all platforms without requiring a system-wide installation.
Please let me know why you would write it in C++.
As someone who loves writing in C++ and Rust, I don't see the point.
Python works best these days with Astral UV.
I use Rust and C++ for things that make sense, such as low-level code and firmware, but TTS doesn't really require that.
Also, if we want to support many languages with many phonemizers, keep in mind that most phonemizers are written in Python.
Using both Python and binaries will only complicate things.
@thewh1teagle this repo already has the C++ code in https://github.com/rhasspy/piper/tree/master/src/cpp
I was wondering if @synesthesiam is still interested in maintaining it.
I believe that the Piper training code plus an inference library on PyPI could be a few hundred lines of code (except for some model definitions). The preprocessing can live in another project; I may write it some time.
@synesthesiam I'm not sure how this fits into the conversation, but I think I should make this point too. I'm neither a Python nor a C++ developer, but I have been using Piper for the last 1.5 years: your piper.exe binary on Windows 10 and Windows 11, as well as on my Ubuntu 24.0 machine and my Mac laptop. I added the piper.exe path to my Golang desktop app, which runs Piper through CLI commands. So people like me do need Piper binaries to play with. I hope you will still consider adding executables for each platform in the newer GitHub repo.
btw, thank you so much for such an amazing project. Let me know if I can help you with Golang, PHP, or Node.js.
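For readers wondering what "runs Piper through CLI commands" looks like from a host application, here is a rough Python sketch of the same pattern (the comment above uses Golang, but the idea carries over). It assumes a standalone `piper` binary that reads text on stdin and accepts `--model` and `--output_file`, as the C++ CLI in this repo does; the helper names and file paths are hypothetical.

```python
import subprocess
from typing import List


def build_piper_command(binary: str, model: str, output_file: str) -> List[str]:
    """Assemble the argument list for the piper executable.

    Assumes the CLI accepts --model and --output_file.
    """
    return [binary, "--model", model, "--output_file", output_file]


def synthesize(text: str, binary: str, model: str, output_file: str) -> None:
    """Pipe text to the piper binary on stdin and wait for it to finish."""
    subprocess.run(
        build_piper_command(binary, model, output_file),
        input=text.encode("utf-8"),
        check=True,  # raise CalledProcessError if piper exits non-zero
    )
```

A host app would call something like `synthesize("Hello", "piper.exe", "en_US-hfc_male-medium.onnx", "hello.wav")`, which is why prebuilt executables matter to users who embed Piper this way.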
I agree.
This is an amazing project!
I am not a developer; I am a totally blind computer user who has NVDA as my Windows 11 screen reader, and I have the Sonata NVDA add-on package for my default voices.
But that means that I can only use my NVDA Screen Reader to make my Piper voices talk.
I wish there could be a way to use my Piper voices with all my TTS apps like Balabolka.
I guess that would mean SAPI5.
But they say that Microsoft makes it really hard to develop new and better SAPI voices.
My favourite voice is the LibriTTS_R voice package, speaker #373.
But I can't call her #373 so I call her Rachel.
Haha.
Your thoughts?