FunASR
                        A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
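Could the C# project AliParaformerAsr.Examples support timestamps, and could examples be added for punctuation restoration and speaker diarization? Alternatively, could you outline an approach so that I can implement it myself?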
Notice: In order to resolve issues more efficiently, please raise issues following the template and include the relevant details. ## ❓ Questions and Help ### Before asking: 1. search the issues. 2. search the...
Following https://github.com/alibaba-damo-academy/FunASR/blob/main/runtime/docs/SDK_advanced_guide_online_zh.md, I deployed the real-time speech recognition server and client on the same host. The first test after deployment succeeded: the client connected to the server and received recognition results. However, after using the commands ps -x | grep funasr-wss-server-2pass and kill -9 PID to find and stop the server, and then using the command nohup bash run_server_2pass.sh \ --model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx \ --online-model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online-onnx \ --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \ --punc-dir damo/punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727-onnx \ --lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \ --itn-dir thuduj12/fst_itn_zh \ --certfile...
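In the ASR result for my recording, the text string has length 7424 while the timestamp list has length 6767. The text contains 652 punctuation marks of five kinds, [?,。.、], which still leaves a difference of 5. I counted, and the enumeration comma (、) appears exactly five times. Is this normal? With the provided test audio, the lengths do match.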
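The length check described above can be sketched as follows. This is only an illustration of the arithmetic, assuming one timestamp entry per non-punctuation character; the function name `unexplained_gap` and the sample data are hypothetical and not part of FunASR.

```python
def unexplained_gap(text, timestamps, punctuation):
    """Characters left over after removing punctuation and matching timestamps.

    Assumes (hypothetically) one timestamp entry per non-punctuation character.
    """
    punct_count = sum(1 for ch in text if ch in punctuation)
    return len(text) - punct_count - len(timestamps)


# The five punctuation kinds listed in the report.
PUNCTUATION = "?,。.、"

# Toy example: 4 spoken characters, 2 punctuation marks, 4 timestamps.
toy_text = "你好,世界。"
toy_timestamps = [[0, 100], [100, 200], [200, 300], [300, 400]]
toy_gap = unexplained_gap(toy_text, toy_timestamps, PUNCTUATION)

# The numbers from the report: 7424 characters, 652 punctuation marks,
# 6767 timestamps.
reported_gap = 7424 - 652 - 6767
```

With the reported numbers the leftover is 5, which equals the number of enumeration commas (、) counted in the text.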
### Problem 1. For example, with the speech input “二十二” (twenty-two), the error `TypeError: 'list' object cannot be interpreted as an integer` occurs. 2. Also, how can I get the output as digits, for example for phone numbers? ### Code ```python funasr_model= AutoModel(model="./FunASR/paraformer-zh", vad_model="./FunASR/fsmn-vad", punc_model="./FunASR/ct-punc", device="cuda:3") filename = "./audio.wav" with open(filename, "wb") as f:...
## 🐛 Bug: Fine-tuning the model by following the tutorial fails with: forward() missing 4 required positional arguments: 'speech', 'speech_lengths', 'text', and 'text_lengths' ### To Reproduce Steps to fine-tune the model, following the tutorial https://github.com/alibaba-damo-academy/FunASR/blob/main/examples/industrial_data_pretraining/paraformer/README_zh.md: 1. cd examples/industrial_data_pretraining/paraformer 2. sh train_from_local.sh 3. train.jsonl and val.jsonl...
Notice: In order to resolve issues more efficiently, please raise issues following the template and include the relevant details. ## ❓ Questions and Help I looked at the token.json file of the Vietnamese UniASR pretrained model on ModelScope. Many of its tokens look like "", "", "-@@", "'", "'@@", "a", "a@@". Does the @ symbol in the token file have a special meaning? The token file I generated with aishell/paraformer/utils/text2token.py contains nothing involving the @ symbol.
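For context, a trailing `@@` is the marker that subword-nmt style BPE tokenizers conventionally append to a subword that continues into the next token. Whether this particular token.json follows that convention is an assumption here, not something the issue confirms. Under that assumption, a minimal sketch of how such tokens would be joined back into words (the function name `merge_subwords` and the sample tokens are hypothetical):

```python
def merge_subwords(tokens, marker="@@"):
    """Join BPE pieces into words, assuming `marker` means 'continues in next token'."""
    words = []
    current = ""
    for tok in tokens:
        if tok.endswith(marker):
            # Continuation piece: strip the marker and keep building the word.
            current += tok[: -len(marker)]
        else:
            # Final piece of a word: emit the accumulated word.
            words.append(current + tok)
            current = ""
    if current:
        # Dangling continuation piece at the end of the sequence.
        words.append(current)
    return " ".join(words)


merged = merge_subwords(["xin", "ch@@", "ào"])
```

Under this reading, "a" and "a@@" are distinct vocabulary entries: the first ends a word, the second is a word-internal piece.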
## ❓ Questions and Help With a single GPU, a T4 runs out of memory (OOM). I have several T4 cards; what should I change in order to use multiple GPUs? #### Code ``` from funasr import AutoModel model = AutoModel(model="Qwen-Audio", model_path="models/Qwen-Audio" ) audio_in = "39937061.wav" # 1st dialogue turn prompt = '这个内容是什么?' cache =...
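## ❓ Questions and Help I already discussed this in the DingTalk group and was told that it is a UniASR model adaptation issue. I hope it can be resolved soon. #### What have you tried? I read the source code and confirmed that it is a model issue. #### What's your environment? - linux - funasr 1.0.10 - ModelScope Version 1.12.0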
Notice: In order to resolve issues more efficiently, please raise issues following the template and include the relevant details. I used the UniASR real-time Vietnamese model, https://modelscope.cn/models/iic/speech_UniASR_asr_2pass-vi-16k-common-vocab1001-pytorch-online/summary. Because UniASR does not yet provide fine-tuning and training scripts, I trained with the fine-tuning script from https://github.com/alibaba-damo-academy/FunASR/tree/main/examples/industrial_data_pretraining/paraformer, keeping the training parameters the same as in the script and only changing the pretrained model to the UniASR Vietnamese model. After 100 epochs of training, the accuracy has stayed at around 80%. ## ❓ Questions and Help Do I need to regenerate the tokens and cmvn files and retrain? ### Before asking: 1. search the...