FunASR
runtime error for blank audio file
🐛 Bug
Running ASR inference on a blank audio file can raise a runtime error, as shown in the attachment. Taking the uploaded test audio audio.mp3.zip as an example, this is how the error arises:
1. The VAD model outputs two segments, [[t0,t1],[t0,t1]].
2. The ASR model returns "" for each of the two VAD segments, but the combined result is " " because " " is used as the separator when joining the segment results.
3. The subsequent steps do not treat the string " " as empty text.
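The joining behavior in step 2 can be reproduced without FunASR. The following is a minimal sketch only: `combine_segment_texts` is a hypothetical helper written for illustration, not FunASR's actual code, and it shows one possible way to avoid the whitespace-only result by dropping empty segments before joining.

```python
def combine_segment_texts(segment_texts, separator=" "):
    """Join per-segment ASR texts, skipping empty or whitespace-only segments.

    Hypothetical helper for illustration; not part of the FunASR API.
    """
    non_empty = [text.strip() for text in segment_texts if text and text.strip()]
    return separator.join(non_empty)


# Two VAD segments, each recognized as empty text (the blank-audio case).
segment_texts = ["", ""]

# Naive join: the separator alone survives, so the result is not empty.
naive = " ".join(segment_texts)

# Filtered join: empty segments are dropped, so the result is truly empty.
safe = combine_segment_texts(segment_texts)

print(repr(naive))
print(repr(safe))
```

A downstream check such as `if not text.strip():` would also catch the `" "` case if the join itself is left unchanged.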
To Reproduce
from funasr import AutoModel
model = AutoModel(
    model='./model_path/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
    model_revision="v2.0.4",
    vad_model='./model_path/speech_fsmn_vad_zh-cn-16k-common-pytorch',
    vad_model_revision="v2.0.4",
    punc_model='./model_path/punc_ct-transformer_zh-cn-common-vocab272727-pytorch',
    punc_model_revision="v2.0.4",
    sentence_timestamp=True,
    return_raw_text=False,
    batch_size_s=30,
)
res = model.generate(input="audio.mp3")
print(res)
Expected behavior
For blank audio files, inference should return empty text rather than raise a runtime error.
Environment
OS: Linux
FunASR Version: 1.0.25
PyTorch Version: 1.12.0
How you installed funasr: pip
Python version: 3.9
GPU: p40
CUDA/cuDNN version: 113