FunASR
Does the hotword model degrade recognition quality?
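I tested with the code under /runtime/python/websocket, with mode set to 2pass. In practice, the offline models that support hotwords give weaker overall recognition than the non-hotword model: the hotword models more often output filler words such as “嗯” ("um") and “啊” ("ah"), which the non-hotword model corrects away. Does the hotword model degrade recognition quality? If so, how can this be resolved?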
asr_model_online:
- speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online
vad_model:
- speech_fsmn_vad_zh-cn-16k-common-pytorch
punc_model:
- punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727
Offline model:
- speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
Hotword models:
- speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
- speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404
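To make the comparison between the models above measurable, here is a minimal sketch that counts filler characters in a transcript. It is not part of FunASR: the helper names `count_fillers` and `filler_rate`, the `FILLERS` set, and the two sample strings are all hypothetical and only illustrate the idea. Real transcripts from each model would be substituted in.

```python
from collections import Counter

# Hypothetical filler set (an assumption); extend it for your own data.
FILLERS = ("嗯", "啊", "呃", "哦")


def count_fillers(text: str, fillers=FILLERS) -> Counter:
    """Count occurrences of each filler character in a transcript."""
    return Counter(ch for ch in text if ch in fillers)


def filler_rate(text: str, fillers=FILLERS) -> float:
    """Fraction of characters in the transcript that are fillers."""
    if not text:
        return 0.0
    return sum(count_fillers(text, fillers).values()) / len(text)


# Made-up transcripts for illustration only, not real model output.
hotword_out = "嗯今天啊我们开会"
baseline_out = "今天我们开会"

print(count_fillers(hotword_out))
print(filler_rate(hotword_out), filler_rate(baseline_out))
```

Running the same audio through the hotword and non-hotword offline models and comparing the two rates would show whether the extra fillers are systematic or limited to a few utterances.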
What's your environment?
- OS (e.g., Linux): Linux
- FunASR Version (e.g., 1.0.0): 1.0.19
- PyTorch Version (e.g., 2.0.0): 1.13.1
- How you installed funasr (pip, source): pip
- Python version: 3.10.10