piper
Docker Image with REST API?
Hello, I can see that this project is the successor to Mycroft's Mimic3. I was wondering if there's a docker container for this project so that something like this can be achieved:
curl -X POST --data 'Hello world.' --output - localhost:59125/api/tts | aplay
At the moment, I have Mimic3 running on my Kubernetes cluster, and use it for a bunch of things, but it has some issues with parsing SSML, and it looks like the project is abandoned and the main developer has moved here.
I have integrated Mimic3 on my cluster into Home Assistant:
tts:
  - platform: marytts
    host: 10.25.29.113
    port: 59125
    codec: "WAVE_FILE"
    language: "en_US"
    voice: "en_US/hifi-tts_low#92"
  - platform: google_translate
But I would rather Piper be separate from Home Assistant and instead be managed by Kubernetes, since the cluster has high availability (several nodes with i7 CPUs) as opposed to a single RPi4. I didn't see anything in the main README about running an API server in Python.
It looks like it's possible: https://www.youtube.com/watch?v=pLR5AsbCMHs. You can create an HTTP server with the project, so if you modify the Dockerfile this could work.
Ohh yeah, I know it's possible. Was hoping one already existed somewhere. I'll keep using Mimic3 for now if that's the case. If I ever get some more time I'll make one and publish it.
@Slyke I think I solved it. Use this example Dockerfile to build an HTTP server with the project:
FROM python:3.11-slim
# Set the working directory
WORKDIR /app
# Get the latest version of the code
RUN apt update && apt install -y git
RUN git clone https://github.com/rhasspy/piper
# Update pip and install the required packages
RUN pip install --upgrade pip
# Switch to the Python runtime package inside the cloned repo
WORKDIR /app/piper/src/python_run
# Install the package
RUN pip install -e .
# Install the requirements
RUN pip install -r requirements.txt
# Install http server
RUN pip install -r requirements_http.txt
# Copy the folder of piper-voices/de into the container
COPY /copythis/ /app/models
# Expose the port 5000
EXPOSE 5000
# Run the webserver with python -m piper.http_server --model ...
CMD ["python", "-m", "piper.http_server", "-m", "/app/models/mls-medium.onnx"]
Make sure the CMD references your own .onnx model.
Would love to see an official image for this kind of REST API, and am surprised it doesn't exist already. I have tried sussing out the protocol and may try something like your suggestion.
@ErroneousBosch It's done: https://hub.docker.com/r/artibex/piper-http
The first version can download Hugging Face models from the voices repo: https://huggingface.co/rhasspy/piper-voices/tree/v1.0.0
Example:
docker run -e MODEL_DOWNLOAD_LINK='https://huggingface.co/rhasspy/piper-voices/resolve/v1.0.0/en/en_US/kusal/medium/en_US-kusal-medium.onnx?download=true' --name piper -p 5000:5000 artibex/piper-http
Let me know if it also works for you.
@artibex It does indeed work!
curl --header "Content-Type: application/json" --request POST --data 'Hello World' --output - "http://localhost:5000" | aplay
Works as expected. It seems to just speak back whatever the data is, so there's no configuration of speaker, but it definitely works and returns audio!
@ErroneousBosch my idea was to create one container for each speaker. So if you need other voices just put up a second container on a different port. No need to wait for a voice to load, just run the container and it works 👍
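The one-container-per-voice idea could be driven from a small client that maps each voice to the port its container listens on. This is only a sketch: the VOICE_PORTS table, the port 5001, and the function names are hypothetical, and it assumes each container accepts the text as a plain POST body at the root URL, as in the curl example above.

```python
import urllib.request

# Hypothetical mapping: one piper-http container per voice, each on its own port.
VOICE_PORTS = {
    "en_US-kusal-medium": 5000,
    "mls-medium": 5001,  # assumed second container, not from the thread
}


def build_request(voice, text, host="localhost"):
    """Build the POST request for the container that serves `voice`."""
    port = VOICE_PORTS[voice]  # raises KeyError for an unknown voice
    return urllib.request.Request(
        f"http://{host}:{port}",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8"},
        method="POST",
    )


def synthesize(voice, text, host="localhost"):
    """Send the text to the matching container and return the WAV bytes."""
    with urllib.request.urlopen(build_request(voice, text, host)) as response:
        return response.read()
```

For example, synthesize("en_US-kusal-medium", "Hello World") would return the audio bytes from the container on port 5000, which can then be written to a .wav file.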
An issue now is that there is no authentication method. Anyone who knows the host and port can generate .wav files. Any idea how to add this?
Added the repo for the code of piper-http: https://github.com/artibex/piper-http
@artibex While I'm not a Python programmer, I am a developer, and looking at the code for the built-in http_server, it doesn't look like there is a built-in way to specify an API key or authentication. To add it you'd need something more sophisticated in front of it.
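One way to put "something in front of it" is a tiny proxy that rejects requests without a shared key and forwards the rest to the Piper container. The sketch below uses only the Python standard library. The X-API-Key header name, the API_KEY value, the proxy port 8080, and the function and class names are all assumptions made for illustration; the upstream address is the one used earlier in the thread. In practice a reverse proxy such as nginx or a Kubernetes ingress with authentication would probably be the sturdier choice.

```python
import hmac
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

UPSTREAM = "http://localhost:5000"  # Piper HTTP server from this thread
API_KEY = "change-me"               # hypothetical shared secret


def is_authorized(supplied_key, expected_key=API_KEY):
    """Compare the supplied key with the expected one in constant time."""
    if supplied_key is None:
        return False
    return hmac.compare_digest(supplied_key.encode(), expected_key.encode())


class AuthProxy(BaseHTTPRequestHandler):
    def do_POST(self):
        # Reject the request before it ever reaches Piper.
        if not is_authorized(self.headers.get("X-API-Key")):
            self.send_error(401, "Missing or invalid API key")
            return
        length = int(self.headers.get("Content-Length", 0))
        text = self.rfile.read(length)
        upstream_request = urllib.request.Request(
            UPSTREAM,
            data=text,
            headers={"Content-Type": "text/plain; charset=utf-8"},
            method="POST",
        )
        try:
            with urllib.request.urlopen(upstream_request) as upstream:
                audio = upstream.read()
        except urllib.error.URLError:
            self.send_error(502, "Upstream TTS server unavailable")
            return
        self.send_response(200)
        self.send_header("Content-Type", "audio/wav")
        self.send_header("Content-Length", str(len(audio)))
        self.end_headers()
        self.wfile.write(audio)


def serve(port=8080):
    """Run the proxy; only this port should be exposed, not Piper's own."""
    ThreadingHTTPServer(("0.0.0.0", port), AuthProxy).serve_forever()
```

Calling serve() starts the proxy. Clients would then send their text to port 8080 with the X-API-Key header set, while port 5000 stays reachable only from the proxy.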
Having some issues with this image:
CTRL+C doesn't seem to kill the process when running in WSL2. I added some code into run.py:
import signal
import sys

# Exit cleanly when Ctrl+C (SIGINT) is received
def signal_handler(sig, frame):
    print('Terminating process...')
    sys.exit(0)

signal.signal(signal.SIGINT, signal_handler)
and also added -it --init in the docker run command to try to fix this, with no success. It requires docker stop to be run from another terminal to kill the process and free the first terminal.
It wants to download the model on startup each time. It should accept the model name as a parameter and only download the model if it doesn't already exist. This would require mounting the target_folder, but that's not a big deal.
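The "only download if it doesn't exist" behaviour could look roughly like the sketch below. The function names and the fetch parameter are hypothetical and not taken from run.py; the target folder is assumed to be a mounted volume so the file survives container restarts. Piper voices also ship a matching .onnx.json config file, which would presumably need the same treatment.

```python
import os
import urllib.request
from urllib.parse import urlparse


def model_path_for(url, target_folder):
    """Derive the local file path from the download URL (query string ignored)."""
    filename = os.path.basename(urlparse(url).path)
    return os.path.join(target_folder, filename)


def ensure_model(url, target_folder, fetch=urllib.request.urlretrieve):
    """Download the model only if it is not already in target_folder."""
    os.makedirs(target_folder, exist_ok=True)
    path = model_path_for(url, target_folder)
    if not os.path.exists(path):
        # Download to a temporary name first so an interrupted download
        # is never mistaken for a complete model on the next start.
        partial = path + ".part"
        fetch(url, partial)
        os.replace(partial, path)
    return path
```

On the first start ensure_model(url, "/app/models") would fetch the file; on later starts with the same volume mounted it would return the existing path without touching the network.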
Python is not my strength, so I might use nikolaik/python-nodejs:python3.11-nodejs20-slim and spin up a NodeJS server that allows voice switching, downloading models, etc. through the API. Based on how piper.http_server works, it looks like the piper process will have to be killed when switching voices, unless I can figure out how to get it to stream from Python to NodeJS for download. Then voices could be switched and options changed on the fly.