FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

Results 555 FunASR issues

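![PixPin_2024-08-08_00-45-37](https://github.com/user-attachments/assets/843a5787-4da8-4a2d-ae1a-bbf824e9dc1d) I am running CentOS Linux release 7.9.2009 (Core). Whether I use the one-click deployment or follow the Docker image pull instructions, the connection ultimately fails, and clicking "Authorize" does not open the web page. Is this Linux distribution unsupported?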

question

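Notice: In order to resolve issues more efficiently, please raise issues following the template and include the relevant details. ## ❓ Questions and Help #### What is your question? Hello, I fine-tuned the paraformer_zh model on data that contains some English words not covered by the vocabulary (words similar to funasr). When I use the fine-tuned model to transcribe audio containing these words, I do not get the expected results. If the data contains English words that are proper nouns, how should I fine-tune so that the model recognizes them? #### Code I did not modify the fine-tuning code; I used https://github.com/modelscope/FunASR/blob/main/examples/industrial_data_pretraining/paraformer/finetune.sh...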

question

funasr-http-server crashes when it receives a request. The problem is always reproducible when sending a request with curl: curl -F '[email protected]' 127.0.0.1:10099 ## 🐛 Bug Running funasr-http-server and sending a request the way the documentation describes, curl -F '[email protected]' 127.0.0.1:10099, crashes the service with the following output: ![image](https://github.com/user-attachments/assets/e3a2f403-0bdc-4bf9-8394-a430ea07da45) ![image](https://github.com/user-attachments/assets/e6dd0e99-bc5e-4b62-80eb-8037d6451689) ![image](https://github.com/user-attachments/assets/4a95617a-ba31-437e-8c2c-2db2dbf7b0ff) It appears to crash while parsing the POST content.

bug

## 🐛 Bug The paraformer-en recognition results are completely wrong, possibly because of a problem with the vocabulary. As the image below shows, non-English tokens such as "de" appear in the results. ![image](https://github.com/user-attachments/assets/8411d37b-a248-4665-8007-e8e8127996d4) I have already updated funasr with the steps below, but the problem persists. ![image](https://github.com/user-attachments/assets/70e03ae9-4d2d-41fd-8568-7e1820b7299e) If the vocabulary is at fault, tokens.json appears to be part of the model files; should the paraformer-en model version be updated? ### To Reproduce Steps to reproduce the...

bug

## 🐛 Bug When running ASR through the pipeline with an scp file as input, the process does not release its hold on the files in the output directory after ASR finishes. ### To Reproduce Run the following code:

```
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks
import shutil

inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model='iic/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
    model_revision="v2.0.4")

scp = 'path/to/your/scp/0.scp'
inference_pipeline(input=scp, output_dir='./output_dir')
...
```

bug

## ❓ Questions and Help This issue is the same as https://github.com/modelscope/FunASR/issues/1916 I have follow...

question

https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_vad_punc_example.wav https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav Why is the recognition result empty for both of these audio files? [{'key': 'asr_vad_punc_example', 'text': '', 'timestamp': []}] I am running locally on CPU. Code: modelFile = "paraformer-zh" model = AutoModel(model=modelFile, vad_model="fsmn-vad", punc_model="ct-punc", spk_model="cam++", log_level="debug", hub="ms", ) res = model.generate( input=fileSrc, batch_size_s=300 ) print(f"结果{res}")...

from funasr import AutoModel model = AutoModel( model="iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch", vad_model="iic/speech_fsmn_vad_zh-cn-16k-common-pytorch", punc_model="iic/punc_ct-transformer_zh-cn-common-vocab272727-pytorch", spk_model="iic/speech_campplus_sv_zh-cn_16k-common", ) res = model.generate( input="D:\FunASR-main\output.wav", ) print(res) 'sentence_info': [{'text': '中国的国家主席是谁?', 'start': 1430, 'end': 3575, 'timestamp': [[1430, 1630], [1630, 1850],...

question

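I am running model inference with the official Python code, `asr_model.generate(filename,batch_size_s=300, merge_length_s=15, hotword='魔搭')`, and the first progress line that appears shows an rtf of about 0.831. According to the code logic, this should be the processing efficiency of the VAD model? Is this normal? Is it possible to speed it up?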

question
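
For readers unfamiliar with the rtf figure in the question above: real-time factor is conventionally the processing time divided by the duration of the audio processed, so a value below 1 means faster than real time. The sketch below only illustrates that ratio. The function name and the timings are hypothetical, and whether FunASR's progress line computes rtf exactly this way is an assumption, not something the issue confirms.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration (hypothetical helper)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Made-up timings, chosen only so the ratio matches the 0.831 quoted in the issue.
rtf = real_time_factor(processing_seconds=8.31, audio_seconds=10.0)
print(f"rtf={rtf:.3f}")
```

Under this definition, an rtf of 0.831 would mean that stage takes roughly 83% of the audio's duration to run.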