PaddleSpeech [S2T] PaddleSpeech-Server-RESTful-API does not recognize the pcm format, and the punc parameter has no effect

Open mzgcz opened this issue 1 year ago • 4 comments

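Describe the bug: The PaddleSpeech-Server documentation says the speech recognition service supports both pcm and wav formats, but submitting a pcm file produces the following error: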

raise LibsndfileError(err, prefix="Error opening {0!r}: ".format(self.name))
soundfile.LibsndfileError: Error opening <_io.BytesIO object at 0x7f9bc43d1b30>: Format not recognised.
[2023-12-01 15:54:05,375] [ ERROR] - can not open the audio file, please check the audio file(<_io.BytesIO object at 0x7f9bc43d1b30>) format is 'wav'.
you can try to use sox to change the file format.
For example:
sample rate: 16k
sox input_audio.xx --rate 16k --bits 16 --channels 1 output_audio.wav
sample rate: 8k
sox input_audio.xx --rate 8k --bits 16 --channels 1 output_audio.wav

[2023-12-01 15:54:05,375] [ ERROR] - file check failed!
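
The "Format not recognised" error is consistent with libsndfile being handed headerless bytes: raw PCM has no container header to detect. A possible client-side workaround, sketched below, is to wrap the PCM samples in a WAV container before encoding them for the request. The helper name `pcm_to_wav_bytes` is hypothetical, and the defaults (16 kHz, mono, 16-bit) are assumptions that must match the actual recording.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 16000,
                     channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap headerless PCM samples in a RIFF/WAV container (hypothetical helper)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample: 2 means 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)            # header sizes are filled in on close
    return buf.getvalue()


# Four 16-bit samples of silence stand in for real audio here.
raw_pcm = b"\x00\x00" * 4
wav_bytes = pcm_to_wav_bytes(raw_pcm)
```

The resulting bytes can then be sent with "audio_format": "wav". This sidesteps the missing pcm handling rather than fixing it on the server.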

Dec 01 '23 08:12 mzgcz

Pass the audio_format field: { "audio": "exSI6ICJlbiIsCgkgICAgInBvc2l0aW9uIjogImZhbHNlIgoJf...", "audio_format": "pcm", "sample_rate": 16000, "lang": "zh_cn", "punc": 0 }
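
For reference, a request body with these fields could be assembled roughly as follows. This is only a sketch: `build_asr_request` is a hypothetical helper, the field names are the ones shown in this thread, and sending the body to the server is left out.

```python
import base64
import json


def build_asr_request(audio_bytes: bytes, audio_format: str = "pcm",
                      sample_rate: int = 16000, lang: str = "zh_cn",
                      punc: int = 0) -> str:
    """Build the JSON request body (hypothetical helper, fields as in this thread)."""
    payload = {
        # The audio payload travels as a base64 string.
        "audio": base64.b64encode(audio_bytes).decode("ascii"),
        "audio_format": audio_format,
        "sample_rate": sample_rate,
        "lang": lang,
        "punc": punc,
    }
    return json.dumps(payload)


# Two 16-bit samples stand in for real audio here.
body = build_asr_request(b"\x01\x00\x02\x00")
```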

Jan 02 '24 12:01 zxcd

Passing the punc parameter has no effect for either wav or pcm: no punctuation is added.

Jan 08 '24 08:01 beixiang-l

Pass the audio_format field: { "audio": "exSI6ICJlbiIsCgkgICAgInBvc2l0aW9uIjogImZhbHNlIgoJf...", "audio_format": "pcm", "sample_rate": 16000, "lang": "zh_cn", "punc": 0 }

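Let me confirm: when "audio_format" is "pcm", the audio field is supposed to carry a raw pcm payload, right? If it carries a wav payload, there is no problem. And from reading the code, it only handles wav payloads; there is no handling for raw pcm payloads.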
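
To illustrate what handling a raw pcm payload would involve, here is a minimal sketch. It is not PaddleSpeech's code: `decode_pcm16_payload` is a hypothetical function, and it assumes 16-bit signed little-endian mono samples, which a headerless payload cannot confirm on its own.

```python
import array
import base64
import sys


def decode_pcm16_payload(audio_b64: str) -> list:
    """Decode base64 raw PCM into floats in [-1.0, 1.0) (hypothetical helper)."""
    raw = base64.b64decode(audio_b64)
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(raw)      # raises ValueError on an odd byte count
    if sys.byteorder == "big":
        samples.byteswap()      # the payload is assumed to be little-endian
    return [s / 32768.0 for s in samples]


# Three samples: 0, 16384 and -32768 in little-endian byte order.
payload = base64.b64encode(b"\x00\x00\x00\x40\x00\x80").decode("ascii")
floats = decode_pcm16_payload(payload)
```

Because raw PCM carries no header, the sample rate and channel count have to come from the request itself, which is presumably what the sample_rate field is for.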

Jan 09 '24 03:01 mzgcz

I tried it, and the punc parameter has no effect with wav either... data = { "audio": base64_string, "audio_format": "wav", "sample_rate": 32000, "lang": "zh_cn", "punc": True }

Jun 13 '24 03:06 warkcod
