
Incorrect recognition results when using the paraformer-8k offline model

Open SmellStone opened this issue 1 month ago • 0 comments

Notice: To help resolve issues more efficiently, please follow the template when raising an issue and fill in the details.

❓ Questions and Help

Before asking:

  1. search the issues.
  2. search the docs.

What is your question?

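When using the 8k ASR model iic/speech_paraformer_asr_nat-zh-cn-8k-common-vocab8358-tensorflow1 for Chinese speech recognition, the text output by the model is clearly wrong. The test audio is the example audio bundled with the model (asr_example.wav, sample rate 8 kHz). The audio content is: 每一天都要开心哦 ("Be happy every day"). Both methods below recognize it as: 第千站站十千伏美一线专用花容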

Code

Both the funasr and the modelscope.pipelines approaches give the wrong result. The funasr code is:

    from funasr import AutoModel

    model = AutoModel(
        model="iic/speech_paraformer_asr_nat-zh-cn-8k-common-vocab8358-tensorflow1",
        model_revision="v2.0.4",
    )
    res = model.generate(
        input=f"{model.model_path}/example/asr_example.wav",
        batch_size_s=300,
    )
    print(res)

The modelscope code is:

    from modelscope.pipelines import pipeline
    from modelscope.utils.constant import Tasks

    inference_pipeline = pipeline(
        task=Tasks.auto_speech_recognition,
        model="iic/speech_paraformer_asr_nat-zh-cn-8k-common-vocab8358-tensorflow1",
        model_revision="v2.0.4",
    )

    rec_result = inference_pipeline(
        "models/speech_paraformer_asr_nat-zh-cn-8k/example/asr_example_8K.wav"
    )
    print(rec_result)
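As a sanity check on the input side, the sample rate stated above (8 kHz) can be read directly from the WAV header with Python's standard `wave` module. This is only a minimal sketch and not a fix for the problem: the helper name `wav_info` is hypothetical, and it assumes the example file is an uncompressed PCM WAV.

```python
import wave


def wav_info(path):
    """Read basic header fields from an uncompressed PCM WAV file.

    Returns (sample_rate_hz, channels, sample_width_bytes, duration_seconds).
    Hypothetical helper for illustration; not part of funasr or modelscope.
    """
    with wave.open(path, "rb") as wf:
        rate = wf.getframerate()       # samples per second, e.g. 8000
        channels = wf.getnchannels()   # 1 = mono, 2 = stereo
        width = wf.getsampwidth()      # bytes per sample, e.g. 2 for 16-bit
        duration = wf.getnframes() / rate
    return rate, channels, width, duration


# Example usage (the path is whatever audio file is passed to the model):
# rate, channels, width, duration = wav_info("asr_example.wav")
# print(rate, channels, width, duration)
```

If the reported rate is 8000, the audio matches what an 8k model expects, which would suggest looking at the model files or library version rather than the audio.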

What have you tried?

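Switching to earlier versions of the funasr library changes the recognized text, but none of the results are correct. The 16k model paraformer-zh recognizes the audio correctly. Is this a problem with the model or with something else, and how can it be resolved? I look forward to your reply. Thank you!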

What's your environment?

  • OS (e.g., Linux):
  • FunASR Version (e.g., 1.2.7):
  • ModelScope Version (e.g., 1.31.0):
  • PyTorch Version (e.g., 2.9.0):
  • How you installed funasr (pip, source):
  • Python version: 3.10
  • GPU (A100-80G)
  • CUDA/cuDNN version (cuda12.6):

SmellStone • Oct 20 '25 07:10