Piper as Library?
Can Piper be used as a library directly in an application, or is it necessary to create a wrapper for the Piper executable?
TLDR: There is a Python module for Piper, but I haven't found any documentation for it, so I've written a simple single-script wrapper around it. There is also a more feature-rich library called dimits (https://pypi.org/project/dimits/), which I did not write and which is a bit buggy.
Hello, I assume from this issue that you want to use Piper's text-to-speech in a Python script. There is indeed already an official Python module for Piper, but I haven't been able to find any documentation for it. There are two ways I think you could implement Piper's text-to-speech using this module:
- Using the dimits library https://pypi.org/project/dimits/ (I didn't make this.)
- By using a simple script I made
If you want to use the dimits library, I recommend going to the PyPI page and reading through the guide there: https://pypi.org/project/dimits/.
If you would rather pick and choose the functions you need from my wrapper, or import it as a module, you first need to install the official piper-tts package from PyPI, plus pygame to play the generated audio files cross-platform: pip install piper-tts pygame. Then choose a voice from https://rhasspy.github.io/piper-samples/ and download it. The script is below:
from piper.voice import PiperVoice as piper  # Backbone of text to speech
import sys
import time
import wave  # Writing text to speech to wave files
from os import environ, remove

environ['PYGAME_HIDE_SUPPORT_PROMPT'] = "hide"  # Stop pygame from saying hello in the console
from pygame import mixer

mixer.init()

# Check if the operating system is macOS
if sys.platform == 'darwin':
    print('This library cannot be used on macOS yet, due to piper not being supported there.')
    sys.exit(1)  # Piper isn't available on macOS yet

model = None  # Set the model to None at the start
voice = None  # Set the voice to None at the start

TMP_FILE = 'tmp_text_2_speech.wav'  # Temporary file used by say()


# Function to load and set the voice model to be used
def load(model_set):
    global model
    global voice
    if model_set.endswith('.onnx'):  # Does the filename already end in .onnx?
        model = model_set  # Set model variable as is.
    else:
        model = model_set + '.onnx'  # Set model variable and append .onnx.
    # Try to load the model
    try:
        voice = piper.load(model)
    except Exception:
        print("Something went wrong, did you type the correct name for the model?")
        sys.exit(1)


# Function to save a text to speech file to disk
def save(text, file_name, model_set=None):
    if model_set is None and model is None:  # Check if a model has been set, if not then exit.
        print("No model was set! Please set a model using the load function: load(\"your_model_here\")")
        sys.exit(1)
    if model_set is not None:  # Is there a voice that should be used only once?
        active_voice = piper.load(model_set)
    else:  # If not, use the voice that was set earlier.
        active_voice = voice
    with wave.open(file_name, "wb") as wav_file:
        active_voice.synthesize(text, wav_file)


# Function to save the file to disk, play it on the speakers, then delete the file.
def say(text, model_set=None):
    save(text, TMP_FILE, model_set)
    # Play the file
    mixer.music.load(TMP_FILE)
    mixer.music.set_volume(1)
    mixer.music.play()
    while mixer.music.get_busy():
        time.sleep(0.05)  # Wait for playback to finish without spinning the CPU
    mixer.music.unload()  # Release the file before deleting it (needs pygame 2.0+)
    remove(TMP_FILE)  # Remove the temporary file
A simple example of using this script:
import whatever_you_named_the_script as engine
engine.load('ModelNameHere')
engine.say('I want to eat ice cream after this.')
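One thing the script does not check is whether the voice files are actually in place. As far as I know, each Piper voice ships as an .onnx model plus a matching .onnx.json config that is expected to sit next to it, so a small pre-flight check can give a clearer error than the generic message in load(). This is only a sketch: resolve_voice_files is a hypothetical helper, not part of piper-tts, and the config naming is an assumption worth verifying against the voice you downloaded.

```python
from pathlib import Path


def resolve_voice_files(model_name):
    """Return (model_path, config_path) for a voice, or raise FileNotFoundError.

    Hypothetical helper, not part of piper-tts. Assumes the config file sits
    next to the model and is named "<model>.onnx.json".
    """
    model_path = Path(model_name)
    # Append the extension only when it is missing, as load() does.
    if model_path.suffix != ".onnx":
        model_path = model_path.with_name(model_path.name + ".onnx")
    config_path = model_path.with_name(model_path.name + ".json")

    for path in (model_path, config_path):
        if not path.is_file():
            raise FileNotFoundError(f"Missing voice file: {path}")
    return model_path, config_path
```

The returned model path could then be passed on, for example engine.load(str(model_path)).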
Good luck, and if I was wildly off topic and wrong in my answer please let me know, and I'll try to help.
Nice, thanks for the script. I was looking for a C++ library, but maybe looking at the source of piper-tts will help. Thanks anyway!
Okay, I only know a little bit about C++, but I believe you will be able to either use or take inspiration from the src/cpp/ directory in this repo. main.cpp also seems to contain the command-line handling, which you could pull apart. Good luck!
This project may be of interest: https://voicedock.app/apps/ttspiper/ - wraps Piper with a gRPC API (I haven't used it myself)
Any progress on this? I am interested because for now I'm using system, but I'd like to avoid it :S
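Until a library API is available, one slightly safer alternative to a shell command string is to pass the arguments as a list, which avoids quoting problems in the text. The sketch below is in Python for consistency with the earlier examples. The --model and --output_file flags are taken from the Piper README; the function names and the default executable name "piper" are my own assumptions.

```python
import subprocess


def build_piper_command(model_path, output_path, executable="piper"):
    """Build the argument list for one call to the Piper executable."""
    return [executable, "--model", str(model_path), "--output_file", str(output_path)]


def synthesize_with_executable(text, model_path, output_path, executable="piper"):
    """Send text to the Piper executable on stdin and write a WAV file."""
    command = build_piper_command(model_path, output_path, executable)
    # No shell is involved, so the text needs no quoting or escaping.
    subprocess.run(command, input=text.encode("utf-8"), check=True)
    return output_path
```

With check=True, a non-zero exit status from the executable raises subprocess.CalledProcessError instead of failing silently.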
You may find the project https://github.com/k2-fsa/sherpa-onnx interesting.
It supports all VITS models from Piper, and we have converted all of the English/French/German/Spanish models from Piper to sherpa-onnx. You can try them by visiting the following Hugging Face space: https://huggingface.co/spaces/k2-fsa/text-to-speech
There are also pre-built Android APKs for it. Please see https://k2-fsa.github.io/sherpa/onnx/tts/apk.html
If you want to use sherpa-onnx in your own application, we provide APIs for C++/C/Python/Go/Swift/Kotlin/C#, etc. You can build sherpa-onnx as either a static or a dynamic library. It supports Linux/macOS/Windows.
For instance, you can follow https://k2-fsa.github.io/sherpa/onnx/c-api/index.html to use the C API of sherpa-onnx.
https://github.com/k2-fsa/sherpa-onnx/blob/master/c-api-examples/offline-tts-c-api.c is an example using the C API of sherpa-onnx for TTS.
If you want a minimal version of @mario872's answer:
from piper.voice import PiperVoice as piper
import wave
voice = piper.load("en_US-lessac-medium.onnx")
with wave.open("test.wav", "wb") as wav_file:
    voice.synthesize("This is a test", wav_file)
The next version of Piper will be usable via a C API and a Python API.
Great to hear. Also, is there currently a way to pass in multiple sentences via Python? The README says it can be done through a pipe using the JSON format, but I can't find how to do it in Python.
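For the Python module, one simple approach is to loop over the sentences and write one WAV file per sentence, reusing the synthesize(text, wav_file) call from the minimal example above. This is a sketch under that assumption: synthesize_sentences and the file naming are hypothetical, and other piper-tts releases may name the method differently. I believe a single string containing several sentences also works in one call, but I have not verified that.

```python
import wave
from pathlib import Path


def synthesize_sentences(voice, sentences, output_dir, prefix="sentence"):
    """Write one WAV file per sentence and return the list of paths.

    `voice` is any object with a synthesize(text, wav_file) method, such as
    the PiperVoice loaded in the minimal example. This helper is hypothetical
    and not part of piper-tts.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, sentence in enumerate(sentences, start=1):
        path = output_dir / f"{prefix}_{index:03d}.wav"
        # A fresh file per sentence, so each one gets its own WAV header.
        with wave.open(str(path), "wb") as wav_file:
            voice.synthesize(sentence, wav_file)
        paths.append(path)
    return paths


# Example usage (requires piper-tts and a downloaded voice):
# from piper.voice import PiperVoice
# voice = PiperVoice.load("en_US-lessac-medium.onnx")
# synthesize_sentences(voice, ["First sentence.", "Second sentence."], "out")
```

Loading the voice once and reusing it for every sentence avoids reloading the model on each call, which is where most of the startup time goes.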
@synesthesiam Neat, a C API would be incredibly useful, especially for game design.
Do you have an idea when the next version of Piper will be released?
+1 for a C or C++ API
+1