Audio streaming error
How do I fix this?
$ echo 'This sentence is spoken first. This sentence is synthesized while the first sentence is spoken.' | ./piper --model en_US-libritts-high.onnx --output-raw | aplay -r 22050 -f S16_LE -t raw -
aplay: main:834: audio open error: Operation not supported
[2024-02-03 13:27:34.711] [piper] [info] Loaded voice in 0.325418684 second(s)
[2024-02-03 13:27:34.713] [piper] [info] Initialized piper
One suggestion might be to change the reference to output raw to refer to an actual created file that can be examined independently, in order to see whether the piper or aplay part is having a problem.
Don't forget -c 1 on the aplay command for mono audio. You may also need to use -D <device> to choose a different audio device.
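Once Piper has written a WAV file (for example with --output_file test.wav, where test.wav is only a placeholder name), the header can be read back to check which aplay flags the raw stream needs. The following is a minimal sketch using Python's standard wave module; the helper name aplay_flags_for_wav is made up for illustration and is not part of Piper or ALSA.

```python
import wave


def aplay_flags_for_wav(path):
    """Return aplay flags matching the header of a 16-bit PCM WAV file.

    Hypothetical helper for illustration; `path` is whatever file Piper wrote.
    """
    with wave.open(path, "rb") as wav_file:
        channels = wav_file.getnchannels()
        sample_width = wav_file.getsampwidth()  # bytes per sample
        rate = wav_file.getframerate()
    if sample_width != 2:
        # S16_LE only describes 16-bit samples, so refuse anything else.
        raise ValueError(f"expected 16-bit samples, got {8 * sample_width}-bit")
    # 16-bit PCM in a WAV file is signed little-endian, i.e. S16_LE.
    return ["-r", str(rate), "-f", "S16_LE", "-c", str(channels), "-t", "raw"]
```

If the file plays with a plain aplay test.wav, Piper is working and the flags returned here are the ones to pass when piping raw audio instead.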
change the reference to output raw to refer to an actual created file
This works perfectly fine
Don't forget -c 1 on the aplay command for mono audio. You may also need to use -D <device> to choose a different audio device.
The device names look like this. Which part of the output do I specify as device name?
**** List of CAPTURE Hardware Devices ****
card 1: Generic [HD-Audio Generic], device 0: ALC1220 Analog [ALC1220 Analog]
Subdevices: 1/1
Subdevice #0: subdevice #0
card 1: Generic [HD-Audio Generic], device 2: ALC1220 Alt Analog [ALC1220 Alt Analog]
Subdevices: 1/1
Subdevice #0: subdevice #0
Use aplay -L (capital L) to get the device names. Look for ones that begin with plughw: since they will automatically convert the audio format for you.
For example, I have something like this:
plughw:CARD=PCH,DEV=0
HDA Intel PCH, ALC285 Analog
Hardware device with all software conversions
so I would need aplay -D "plughw:CARD=PCH,DEV=0" ...
There is no device name starting with plughw
null
Discard all samples (playback) or generate zero samples (capture)
pipewire
PipeWire Sound Server
default
Default ALSA Output (currently PipeWire Media Server)
hdmi:CARD=NVidia,DEV=0
HDA NVidia, LG FULL HD
HDMI Audio Output
hdmi:CARD=NVidia,DEV=1
HDA NVidia, HDMI 1
HDMI Audio Output
hdmi:CARD=NVidia,DEV=2
HDA NVidia, HDMI 2
HDMI Audio Output
hdmi:CARD=NVidia,DEV=3
HDA NVidia, HDMI 3
HDMI Audio Output
sysdefault:CARD=Generic
HD-Audio Generic, ALC1220 Analog
Default Audio Device
front:CARD=Generic,DEV=0
HD-Audio Generic, ALC1220 Analog
Front output / input
surround21:CARD=Generic,DEV=0
HD-Audio Generic, ALC1220 Analog
2.1 Surround output to Front and Subwoofer speakers
surround40:CARD=Generic,DEV=0
HD-Audio Generic, ALC1220 Analog
4.0 Surround output to Front and Rear speakers
surround41:CARD=Generic,DEV=0
HD-Audio Generic, ALC1220 Analog
4.1 Surround output to Front, Rear and Subwoofer speakers
surround50:CARD=Generic,DEV=0
HD-Audio Generic, ALC1220 Analog
5.0 Surround output to Front, Center and Rear speakers
surround51:CARD=Generic,DEV=0
HD-Audio Generic, ALC1220 Analog
5.1 Surround output to Front, Center, Rear and Subwoofer speakers
surround71:CARD=Generic,DEV=0
HD-Audio Generic, ALC1220 Analog
7.1 Surround output to Front, Center, Side, Rear and Woofer speakers
iec958:CARD=Generic,DEV=0
HD-Audio Generic, ALC1220 Digital
IEC958 (S/PDIF) Digital Audio Output
It looks like you are running PipeWire. Have you tried the default one?
echo 'This sentence is spoken first. This sentence is synthesized while the first sentence is spoken.' | ./piper --model en_US-libritts-high.onnx --output-raw | aplay -D "default" -r 22050 -f S16_LE -t raw -
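The selection rule discussed in this thread (prefer a plughw: name, otherwise try default) can be sketched in a few lines of Python. This assumes that aplay -L prints device names unindented and their descriptions indented, which is how the tool normally formats its output even though the indentation was lost in the listing pasted above. Both function names are hypothetical.

```python
def parse_aplay_devices(listing):
    """Return device names from `aplay -L` text (the unindented lines).

    Assumes description lines are indented, as aplay normally prints them.
    """
    return [
        line.strip()
        for line in listing.splitlines()
        if line.strip() and not line[0].isspace()
    ]


def pick_device(names):
    """Prefer a plughw: device; otherwise fall back to 'default' if listed."""
    for name in names:
        if name.startswith("plughw:"):
            return name
    return "default" if "default" in names else None


# Shortened copy of the listing from this thread, with indentation restored.
SAMPLE = """\
null
    Discard all samples (playback) or generate zero samples (capture)
pipewire
    PipeWire Sound Server
default
    Default ALSA Output (currently PipeWire Media Server)
hdmi:CARD=NVidia,DEV=0
    HDA NVidia, LG FULL HD
    HDMI Audio Output
"""

chosen = pick_device(parse_aplay_devices(SAMPLE))
```

The chosen name is what goes after -D on the aplay command line. On a PipeWire system like the one above there is no plughw: entry, so the sketch falls back to default.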
I'm seeing a similar issue in piper 1.2.0, with an additional clue to add to the discussion. Piping the output of piper to aplay seems to fail whenever there is more than one sentence to speak. It crashes with an illegal seek, presumably from trying to seek in the output pipe.
echo "What was is the nature of this sentence? Is this sentence number two? Or is it number three?" | piper --model en_GB-aru-medium --speaker 4 --debug | aplay -r 22050 -f S16_LE -t raw -
Playing raw data 'stdin' : Signed 16 bit Little Endian, Rate 22050 Hz, Mono
DEBUG:__main__:Namespace(model='en_GB-aru-medium', config=None, output_file=None, output_dir=None, output_raw=False, speaker=4, length_scale=None, noise_scale=None, noise_w=None, cuda=False, sentence_silence=0.0, data_dir=['/home/user/AI/Models/piper-voices', '/home/user/AI/Models/piper-voices'], download_dir='/home/user/AI/Models/piper-voices', update_voices=False, debug=True)
DEBUG:piper.download:Loading /home/user/AI/Models/piper-voices/voices.json
DEBUG:piper.download:Checking /home/user/AI/Models/piper-voices/en_GB-aru-medium.onnx
DEBUG:piper.download:Checking /home/user/AI/Models/piper-voices/en_GB-aru-medium.onnx.json
DEBUG:piper.download:Checking /home/user/AI/Models/piper-voices/en_GB-aru-medium.onnx
DEBUG:piper.download:Checking /home/user/AI/Models/piper-voices/en_GB-aru-medium.onnx.json
Traceback (most recent call last):
File "/home/user/.local/pipx/venvs/piper-tts/lib/python3.10/site-packages/piper/__main__.py", line 151, in main
voice.synthesize(text, wav_file, **synthesize_args)
File "/home/user/.local/pipx/venvs/piper-tts/lib/python3.10/site-packages/piper/voice.py", line 103, in synthesize
wav_file.writeframes(audio_bytes)
File "/usr/lib/python3.10/wave.py", line 439, in writeframes
self._patchheader()
File "/usr/lib/python3.10/wave.py", line 494, in _patchheader
curpos = self._file.tell()
OSError: [Errno 29] Illegal seek
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/user/.local/bin/piper", line 8, in <module>
sys.exit(main())
File "/home/user/.local/pipx/venvs/piper-tts/lib/python3.10/site-packages/piper/__main__.py", line 150, in main
with wave.open(sys.stdout.buffer, "wb") as wav_file:
File "/usr/lib/python3.10/wave.py", line 332, in __exit__
self.close()
File "/usr/lib/python3.10/wave.py", line 446, in close
self._patchheader()
File "/usr/lib/python3.10/wave.py", line 494, in _patchheader
curpos = self._file.tell()
OSError: [Errno 29] Illegal seek
You need to use --output-raw, like this: https://github.com/rhasspy/piper?tab=readme-ov-file#streaming-audio
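The traceback above ends in wave.py's _patchheader, which calls tell() on stdout. The sketch below reproduces that with only the standard library, on the assumption (not confirmed in this thread) that Piper calls writeframes() once per sentence. The first call fits the length already written in the WAV header; the second makes the wave module go back to patch the header, which a pipe cannot do. Raw PCM has no header to patch, which is likely why --output-raw avoids the error. All function names are made up for illustration, and the behavior shown is for Linux pipes.

```python
import errno
import os
import wave


def write_sentences_as_wav(stream, chunks, rate=22050):
    """Write each chunk with its own writeframes() call (one per 'sentence')."""
    wav_file = wave.open(stream, "wb")
    wav_file.setnchannels(1)
    wav_file.setsampwidth(2)
    wav_file.setframerate(rate)
    try:
        for chunk in chunks:
            wav_file.writeframes(chunk)
    finally:
        try:
            wav_file.close()  # may try to patch the header again
        except OSError:
            pass


def write_sentences_raw(stream, chunks):
    """Write headerless PCM; nothing ever needs to seek backwards."""
    for chunk in chunks:
        stream.write(chunk)


def errno_when_piped(writer, chunks):
    """Run `writer` against the write end of a pipe; return errno or None."""
    read_fd, write_fd = os.pipe()
    try:
        with os.fdopen(write_fd, "wb") as stream:
            try:
                writer(stream, chunks)
            except OSError as err:
                return err.errno
        return None
    finally:
        os.close(read_fd)
```

With a short buffer of silence such as b"\x00\x00" * 100, errno_when_piped(write_sentences_as_wav, [silence]) returns None, while passing [silence, silence] returns errno.ESPIPE, the same "Illegal seek" as in the traceback. The raw writer returns None in both cases.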