Audio streaming error

Open rew1nter opened this issue 1 year ago • 8 comments

How do I fix this?

 $ echo 'This sentence is spoken first. This sentence is synthesized while the first sentence is spoken.' |   ./piper --model en_US-libritts-high.onnx --output-raw |   aplay -r 22050 -f S16_LE -t raw -
  
aplay: main:834: audio open error: Operation not supported
[2024-02-03 13:27:34.711] [piper] [info] Loaded voice in 0.325418684 second(s)
[2024-02-03 13:27:34.713] [piper] [info] Initialized piper
  

rew1nter avatar Feb 03 '24 07:02 rew1nter

One suggestion: replace --output-raw with output to an actual file, which can then be examined independently to see whether the piper part or the aplay part is failing.
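A minimal sketch of that check, assuming piper was told to write a WAV file (for example with --output_file, as in the piper README) instead of raw output. The file name test.wav and the helper describe_wav are hypothetical, and the demo writes a short silent file itself so it runs without piper. It reads the header with Python's standard wave module to show the channel count, sample width, and sample rate that aplay would need for raw playback.

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Return the playback parameters stored in a WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return {
            "channels": wav_file.getnchannels(),
            "sample_width_bytes": wav_file.getsampwidth(),
            "sample_rate_hz": wav_file.getframerate(),
            "frames": wav_file.getnframes(),
        }


# Stand-in for a file produced by piper (assumption): one second of
# 16-bit mono silence at 22050 Hz, matching the aplay flags in the question.
demo_path = os.path.join(tempfile.mkdtemp(), "test.wav")
with wave.open(demo_path, "wb") as wav_file:
    wav_file.setnchannels(1)
    wav_file.setsampwidth(2)
    wav_file.setframerate(22050)
    wav_file.writeframes(b"\x00\x00" * 22050)

info = describe_wav(demo_path)
print(info)
```

If a file written by piper shows parameters matching the aplay flags and plays with a plain aplay test.wav, that would suggest the synthesis side is fine and the problem is in how aplay opens the audio device.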

colbec avatar Feb 04 '24 17:02 colbec

Don't forget -c 1 on the aplay command for mono audio. You may also need to use -D <device> to choose a different audio device.

synesthesiam avatar Feb 04 '24 17:02 synesthesiam

change the reference to output raw to refer to an actual created file

This works perfectly fine

Don't forget -c 1 on the aplay command for mono audio. You may also need to use -D <device> to choose a different audio device.

The device names look like this. Which part of the output do I specify as the device name?

**** List of CAPTURE Hardware Devices ****
card 1: Generic [HD-Audio Generic], device 0: ALC1220 Analog [ALC1220 Analog]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
card 1: Generic [HD-Audio Generic], device 2: ALC1220 Alt Analog [ALC1220 Alt Analog]
  Subdevices: 1/1
  Subdevice #0: subdevice #0

rew1nter avatar Feb 04 '24 18:02 rew1nter

Use aplay -L (capital L) to get the device names. Look for ones that begin with plughw: since they will automatically convert the audio format for you. For example, I have something like this:

plughw:CARD=PCH,DEV=0
    HDA Intel PCH, ALC285 Analog
    Hardware device with all software conversions

so I would need aplay -D "plughw:CARD=PCH,DEV=0" ...
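As a rough illustration of picking a name out of the aplay -L listing, here is a small sketch. The helpers list_alsa_devices and pick_device are hypothetical, the sample text is made up from entries of the kind shown in this thread, and falling back to "default" when no plughw: entry exists is an assumption, not something aplay does itself.

```python
def list_alsa_devices(aplay_output):
    """Return the device names from `aplay -L` output.

    Device names start at column 0; their description lines are indented.
    """
    return [
        line.strip()
        for line in aplay_output.splitlines()
        if line.strip() and not line[0].isspace()
    ]


def pick_device(device_names):
    """Prefer a plughw: device; otherwise use "default" if it is listed."""
    for name in device_names:
        if name.startswith("plughw:"):
            return name
    # Assumption: "default" is a reasonable fallback when no plughw: entry exists.
    return "default" if "default" in device_names else None


# Illustrative sample only, not real output from one machine.
sample_output = """\
default
    Default ALSA Output (currently PipeWire Media Server)
plughw:CARD=PCH,DEV=0
    HDA Intel PCH, ALC285 Analog
    Hardware device with all software conversions
"""

devices = list_alsa_devices(sample_output)
chosen = pick_device(devices)
print(chosen)
```

The chosen string is what goes after -D on the aplay command line. On a real system the input text could come from running aplay -L through subprocess.run with capture_output=True and text=True.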

synesthesiam avatar Feb 04 '24 18:02 synesthesiam

There is no device name starting with plughw

null
    Discard all samples (playback) or generate zero samples (capture)
pipewire
    PipeWire Sound Server
default
    Default ALSA Output (currently PipeWire Media Server)
hdmi:CARD=NVidia,DEV=0
    HDA NVidia, LG FULL HD
    HDMI Audio Output
hdmi:CARD=NVidia,DEV=1
    HDA NVidia, HDMI 1
    HDMI Audio Output
hdmi:CARD=NVidia,DEV=2
    HDA NVidia, HDMI 2
    HDMI Audio Output
hdmi:CARD=NVidia,DEV=3
    HDA NVidia, HDMI 3
    HDMI Audio Output
sysdefault:CARD=Generic
    HD-Audio Generic, ALC1220 Analog
    Default Audio Device
front:CARD=Generic,DEV=0
    HD-Audio Generic, ALC1220 Analog
    Front output / input
surround21:CARD=Generic,DEV=0
    HD-Audio Generic, ALC1220 Analog
    2.1 Surround output to Front and Subwoofer speakers
surround40:CARD=Generic,DEV=0
    HD-Audio Generic, ALC1220 Analog
    4.0 Surround output to Front and Rear speakers
surround41:CARD=Generic,DEV=0
    HD-Audio Generic, ALC1220 Analog
    4.1 Surround output to Front, Rear and Subwoofer speakers
surround50:CARD=Generic,DEV=0
    HD-Audio Generic, ALC1220 Analog
    5.0 Surround output to Front, Center and Rear speakers
surround51:CARD=Generic,DEV=0
    HD-Audio Generic, ALC1220 Analog
    5.1 Surround output to Front, Center, Rear and Subwoofer speakers
surround71:CARD=Generic,DEV=0
    HD-Audio Generic, ALC1220 Analog
    7.1 Surround output to Front, Center, Side, Rear and Woofer speakers
iec958:CARD=Generic,DEV=0
    HD-Audio Generic, ALC1220 Digital
    IEC958 (S/PDIF) Digital Audio Output

rew1nter avatar Feb 05 '24 09:02 rew1nter

It looks like you are running PipeWire. Have you tried the default one?

echo 'This sentence is spoken first. This sentence is synthesized while the first sentence is spoken.' | ./piper --model en_US-libritts-high.onnx --output-raw | aplay -D "default" -r 22050 -f S16_LE -t raw -

odurc avatar Mar 06 '24 16:03 odurc

I'm seeing a similar issue in piper 1.2.0, with an additional clue for the discussion. Piping the output of piper to aplay seems to fail if there is more than one sentence to speak. It crashes with an illegal seek, presumably related to seeking in the output pipe.

echo "What was is the nature of this sentence? Is this sentence number two? Or is it number three?"  |  piper --model en_GB-aru-medium --speaker 4 --debug  |  aplay -r 22050 -f S16_LE -t raw -

Playing raw data 'stdin' : Signed 16 bit Little Endian, Rate 22050 Hz, Mono
DEBUG:__main__:Namespace(model='en_GB-aru-medium', config=None, output_file=None, output_dir=None, output_raw=False, speaker=4, length_scale=None, noise_scale=None, noise_w=None, cuda=False, sentence_silence=0.0, data_dir=['/home/user/AI/Models/piper-voices', '/home/user/AI/Models/piper-voices'], download_dir='/home/user/AI/Models/piper-voices', update_voices=False, debug=True)
DEBUG:piper.download:Loading /home/user/AI/Models/piper-voices/voices.json
DEBUG:piper.download:Checking /home/user/AI/Models/piper-voices/en_GB-aru-medium.onnx
DEBUG:piper.download:Checking /home/user/AI/Models/piper-voices/en_GB-aru-medium.onnx.json
DEBUG:piper.download:Checking /home/user/AI/Models/piper-voices/en_GB-aru-medium.onnx
DEBUG:piper.download:Checking /home/user/AI/Models/piper-voices/en_GB-aru-medium.onnx.json
Traceback (most recent call last):
  File "/home/user/.local/pipx/venvs/piper-tts/lib/python3.10/site-packages/piper/__main__.py", line 151, in main
    voice.synthesize(text, wav_file, **synthesize_args)
  File "/home/user/.local/pipx/venvs/piper-tts/lib/python3.10/site-packages/piper/voice.py", line 103, in synthesize
    wav_file.writeframes(audio_bytes)
  File "/usr/lib/python3.10/wave.py", line 439, in writeframes
    self._patchheader()
  File "/usr/lib/python3.10/wave.py", line 494, in _patchheader
    curpos = self._file.tell()
OSError: [Errno 29] Illegal seek

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/user/.local/bin/piper", line 8, in <module>
    sys.exit(main())
  File "/home/user/.local/pipx/venvs/piper-tts/lib/python3.10/site-packages/piper/__main__.py", line 150, in main
    with wave.open(sys.stdout.buffer, "wb") as wav_file:
  File "/usr/lib/python3.10/wave.py", line 332, in __exit__
    self.close()
  File "/usr/lib/python3.10/wave.py", line 446, in close
    self._patchheader()
  File "/usr/lib/python3.10/wave.py", line 494, in _patchheader
    curpos = self._file.tell()
OSError: [Errno 29] Illegal seek
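The traceback can be reproduced without piper. The sketch below assumes that piper writes one chunk of frames per sentence (the traceback suggests this but does not prove it); the helper name write_two_chunks_to_pipe is hypothetical. It writes two chunks through Python's wave module into an OS pipe: the first write succeeds because the header is written with that chunk's length, while the second makes wave go back to patch the length field, which needs tell() and seek() on a stream that cannot seek.

```python
import errno
import os
import wave


def write_two_chunks_to_pipe():
    """Write two audio chunks as WAV into a pipe; return the OSError raised, if any."""
    read_fd, write_fd = os.pipe()
    chunk = b"\x00\x00" * 100  # 100 frames of 16-bit mono silence
    error = None
    with os.fdopen(read_fd, "rb"), os.fdopen(write_fd, "wb") as pipe_out:
        wav_file = wave.open(pipe_out, "wb")
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(22050)
        try:
            # First chunk: the header is written with this chunk's length,
            # so nothing has to be patched afterwards.
            wav_file.writeframes(chunk)
            # Second chunk: the length in the header is now stale, so wave
            # tries to patch it, which calls tell() on the pipe.
            wav_file.writeframes(chunk)
        except OSError as exc:
            error = exc
        try:
            wav_file.close()  # close() attempts the same header patch
        except OSError:
            pass
    return error


err = write_two_chunks_to_pipe()
print(err)
```

Raw PCM output has no header to patch, which is why a raw stream can be piped to aplay sentence by sentence while a WAV stream written this way cannot.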

xandark avatar Apr 21 '24 23:04 xandark

You need to use --output-raw like this: https://github.com/rhasspy/piper?tab=readme-ov-file#streaming-audio

synesthesiam avatar Apr 22 '24 17:04 synesthesiam