FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Results 484 FunASR issues

Combine multi-blank into FunASR for the transducer

Ubuntu 20.04; funasr and modelscope are the latest versions. A LoRA fine-tuned model cannot be decoded: RuntimeError: AutomaticSpeechRecognitionPipeline: Error(s) in loading state_dict for Paraformer: unexpected key(s) in state_dict: "encoder.encoders0.0.self_attn.linear_q_k_v.lora_A"

bug

Thank you for contributing excellent code. I encountered a problem when using both the online and offline SDKs. Docker Version: registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-cpu-0.2.1 SDK: ./funasr-wss-server & ./funasr-wss-server-2pass Command: python3 funasr_wss_client.py --host "127.0.0.1" --port 10095...

bug

OS: mac Python/C++ Version: python 3.7.0 Package Version: torch 1.13.1, torchaudio 0.13.1, modelscope 1.6.1, funasr version 0.7.6 Model: damo/speech_fsmn_vad_zh-cn-16k-common-pytorch, damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404 Command: [audio and code.zip](https://github.com/alibaba-damo-academy/FunASR/files/12591570/default.zip) Roughly: for the same audio clip, I ran the pipeline VAD and the ONNX VAD separately, then ran ONNX ASR on each of the two VAD results. Because the VAD results differ, some silent segments are recognized as Chinese text by the ASR. ` # Original file: 8K PCM to 16K PCM speech = resample(fs=8000, audio_in=speech) # pipeline vad...

OS: linux Python/C++ Version:3.10/g++ (GCC) 5.4.0 Package Version:pytorch、torchaudio、modelscope、funasr version(pip list) Package Version ---------- ------- click 8.1.7 pip 23.2.1 setuptools 65.5.0 websockets 11.0.3 Model: The ASR model_id used : damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx The...

bug

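Hello, I noticed that the punctuation-restoration post-processing in [ModelScope] can handle Chinese, English, and mixed Chinese-English text. How was the dataset constructed when training the post-processing model? For example, by what method was the mixed Chinese-English data constructed? Thank you.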

![image](https://github.com/alibaba-damo-academy/FunASR/assets/87591236/640b0898-4304-4da2-8de4-c60900a9e3ea) ![image](https://github.com/alibaba-damo-academy/FunASR/assets/87591236/1f382bc5-84c9-40ce-87e8-3aca82839b5b) Below is the result of running the audio from the test case: ![image](https://github.com/alibaba-damo-academy/FunASR/assets/87591236/c399b928-837f-41e8-89ce-d9eea30888ba)

OS: Linux Python/C++ Version: Python 3.7 Package Version: Model: damo/speech_fsmn_vad_zh-cn-16k-common-pytorch Details: `from modelscope.pipelines import pipeline from modelscope.utils.constant import Tasks inference_pipeline = pipeline( task=Tasks.auto_speech_recognition, model='damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch', vad_model='damo/speech_fsmn_vad_zh-cn-16k-common-pytorch', punc_model='damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch', batch_size=64, ) audio_in='/home/FunASR/test_audio/20230807-142709-8026-018529420615-1691389629.668644.wav' rec_result...

The model path is specified, but the log still prints a network connection error. (funasr) ctc@HP-Z6-G4-Workstation:~/anaconda3/envs/funasr/FunASR/egs_modelscope/asr/paraformer/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch$ python finetune.py 2023-09-05 17:08:25,544 - modelscope - INFO - PyTorch version 1.13.1+cu117 Found. 2023-09-05 17:08:25,544 - modelscope - INFO - Loading ast index from /home/ctc/.cache/modelscope/ast_indexer 2023-09-05 17:08:25,607...

I would like a C++ Windows deployment solution that uses the WebSocket protocol, ideally shipped as a Release package.