wave.Error: file does not start with RIFF id

Open philuxzhu opened this issue 2 years ago • 4 comments

I ran tests/chinese_test.py and got the error below. Does anyone know how to solve it?

Traceback (most recent call last):
  File "/Users/zhujunming/Desktop/AIQQ/tts/RealtimeTTS/RealtimeTTS/text_to_stream.py", line 265, in synthesize_worker
    success = self.engine.synthesize(sentence)
  File "/Users/zhujunming/Desktop/AIQQ/tts/RealtimeTTS/RealtimeTTS/engines/system_engine.py", line 72, in synthesize
    with wave.open(self.file_path, 'rb') as wf:
  File "/usr/local/Cellar/python@3.10/3.10.13_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/wave.py", line 509, in open
    return Wave_read(f)
  File "/usr/local/Cellar/python@3.10/3.10.13_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/wave.py", line 163, in __init__
    self.initfp(f)
  File "/usr/local/Cellar/python@3.10/3.10.13_1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/wave.py", line 130, in initfp
    raise Error('file does not start with RIFF id')
wave.Error: file does not start with RIFF id

philuxzhu, Dec 05 '23 03:12

Maybe the sentence tokenizer is messing up the text.

Let's verify this. Add logging so we can see the text that is sent to the engine:

import logging
from RealtimeTTS import SystemEngine

# Turn on debug-level logging globally and for the engine
logging.basicConfig(level=logging.DEBUG)
engine = SystemEngine(level=logging.DEBUG)

Then pass this parameter to play or play_async:

stream.play(log_synthesized_text=True)

You should now see a log message like this in the console:

INFO:root:synthesizing: <TEXT>

Is the text displayed correctly there or does the text look messed up?

KoljaB, Dec 05 '23 11:12

I have the same problem. I am using macOS Ventura.

> Is the text displayed correctly there

Yes. @KoljaB

The problem is that the file generated by SystemEngine is in AIFF format instead of WAV, even though it has a .wav extension:

(base) ➜  ~ mediainfo system_speech_synthesis.wav
General
Complete name                            : system_speech_synthesis.wav
Format                                   : AIFF
Format/Info                              : Apple/SGI
File size                                : 53.8 KiB
Duration                                 : 1 s 157 ms
Overall bit rate mode                    : Constant
Overall bit rate                         : 381 kb/s
FileExtension_Invalid                    : aiff aifc aif

Audio
Format                                   : PCM
Format settings                          : Big / Signed
Codec ID                                 : twos
Duration                                 : 1 s 157 ms
Bit rate mode                            : Constant
Bit rate                                 : 352.8 kb/s
Channel(s)                               : 1 channel
Sampling rate                            : 22.05 kHz
Bit depth                                : 16 bits
Stream size                              : 49.8 KiB (93%)
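
If mediainfo is not installed, the same check can be done from Python by looking at the first 12 bytes of the file. This is only a minimal sketch: the function name sniff_audio_container is made up for illustration and is not part of RealtimeTTS or pyttsx3. It relies on the fact that a WAV file starts with "RIFF" followed by "WAVE" at byte 8, while an AIFF file starts with "FORM" followed by "AIFF" (or "AIFC") at byte 8.

```python
def sniff_audio_container(path: str) -> str:
    """Return "wav", "aiff", or "unknown" based on the file's magic bytes.

    Hypothetical helper for illustration; it ignores the file extension
    and only inspects the 12-byte container header.
    """
    with open(path, "rb") as f:
        header = f.read(12)

    if len(header) < 12:
        return "unknown"

    container_id = header[:4]   # "RIFF" for WAV, "FORM" for AIFF
    form_type = header[8:12]    # "WAVE" for WAV, "AIFF"/"AIFC" for AIFF

    if container_id == b"RIFF" and form_type == b"WAVE":
        return "wav"
    if container_id == b"FORM" and form_type in (b"AIFF", b"AIFC"):
        return "aiff"
    return "unknown"
```

For the file inspected above, this should return "aiff", which is consistent with wave.open refusing it: the wave module expects the file to begin with the RIFF id.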

badbye, Dec 07 '23 06:12

https://github.com/nateshmbhat/pyttsx3/issues/142#issuecomment-1013533459

Confirmed. I tested pyttsx3's save_to_file: it writes AIFF on macOS and WAV on Linux.
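
One possible workaround, until a fix is in place, is to convert the AIFF output to a real WAV file before handing it to the wave module. The sketch below is an assumption about how that could be done with the standard library only; it is not the fix that went into RealtimeTTS, and the names read_extended and aiff_to_wav are hypothetical. It handles only uncompressed 16-bit PCM, which matches the mediainfo output above (16 bits, big-endian, signed).

```python
import array
import struct
import wave


def read_extended(raw: bytes) -> float:
    """Decode the 80-bit extended float AIFF uses for the sample rate."""
    exponent, mantissa = struct.unpack(">HQ", raw)
    sign = -1.0 if exponent & 0x8000 else 1.0
    exponent &= 0x7FFF
    if exponent == 0 and mantissa == 0:
        return 0.0
    return sign * mantissa * 2.0 ** (exponent - 16383 - 63)


def aiff_to_wav(src_path: str, dst_path: str) -> None:
    """Rewrite an uncompressed 16-bit AIFF file as a RIFF/WAVE file."""
    with open(src_path, "rb") as f:
        data = f.read()
    if data[:4] != b"FORM" or data[8:12] != b"AIFF":
        raise ValueError("expected an uncompressed AIFF file")

    # Walk the chunk list: 4-byte id, 4-byte big-endian size, then the body.
    comm = ssnd = None
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack(">4sI", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + size]
        if chunk_id == b"COMM":
            comm = body
        elif chunk_id == b"SSND":
            ssnd = body
        pos += 8 + size + (size & 1)  # chunks are padded to an even length
    if comm is None or ssnd is None:
        raise ValueError("missing COMM or SSND chunk")

    # COMM: channels, frame count, bits per sample, 80-bit sample rate.
    channels, n_frames, bits = struct.unpack(">hIh", comm[:8])
    if bits != 16:
        raise ValueError("this sketch only handles 16-bit PCM")
    rate = round(read_extended(comm[8:18]))

    # SSND: 4-byte offset, 4-byte block size, then the sample data.
    offset = struct.unpack(">I", ssnd[:4])[0]
    start = 8 + offset
    pcm = ssnd[start:start + n_frames * channels * 2]

    # AIFF samples are big-endian, WAV samples are little-endian.
    samples = array.array("h")
    samples.frombytes(pcm)
    samples.byteswap()

    with wave.open(dst_path, "wb") as out:
        out.setnchannels(channels)
        out.setsampwidth(2)
        out.setframerate(rate)
        out.writeframes(samples.tobytes())
```

The AIFF header is parsed by hand here because the standard-library aifc module was removed in Python 3.13. On older Python versions, aifc or an external tool could do the same job with less code.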

badbye, Dec 07 '23 07:12

Thank you very much @badbye for pointing this out.

The fix is now available in v0.3.34.

KoljaB, Dec 07 '23 11:12