FunASR issues

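Can the C# project AliParaformerAsr.Examples support timestamps or punctuation?

Can the C# project AliParaformerAsr.Examples support timestamps and punctuation insertion, and is there a speaker diarization example? Alternatively, could you outline an approach so that I can implement it myself?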

question

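Speaker diarization has a very high MISS error rate. Is this because the VAD module performs poorly? Also, the DER when combined with video is even worse than with audio alone. Can this be fine-tuned?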

1

Notice: In order to resolve issues more efficiently, please raise issues following the template and add the relevant details. ## ❓ Questions and Help ### Before asking: 1. search the issues. 2. search the...

Coconut059

question

Client cannot connect after the server is restarted

I deployed the real-time speech recognition server and client on the same host, following https://github.com/alibaba-damo-academy/FunASR/blob/main/runtime/docs/SDK_advanced_guide_online_zh.md. The first test after deployment succeeded: the client connected to the server and received recognition results. However, after finding and stopping the server with the command ps -x | grep funasr-wss-server-2pass \kill -9 PID, and then running the command nohup bash run_server_2pass.sh \ --model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-onnx \ --online-model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online-onnx \ --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \ --punc-dir damo/punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727-onnx \ --lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \ --itn-dir thuduj12/fst_itn_zh \ --certfile...
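One generic check that may help narrow down a report like this is whether anything is accepting TCP connections on the server's port after the restart. The following is a minimal sketch using only the Python standard library. The host and port values are placeholders (the actual port is not shown in the truncated command above), and the check says nothing about the WebSocket or TLS layers on top of the TCP connection.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


# Placeholder values: replace with the host and port the server was started on.
HYPOTHETICAL_HOST = "127.0.0.1"
HYPOTHETICAL_PORT = 10095

print("server reachable:", is_port_open(HYPOTHETICAL_HOST, HYPOTHETICAL_PORT))
```

If the port is closed, the server process probably did not come back up, and its nohup log is the next place to look. If the port is open but the client still fails, the cause is more likely in the handshake or in the client configuration.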

wenjiwei

question

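Character length and timestamp length do not match when the generated punctuation includes 、

In the ASR result for my recording, the string length is 7424 and the timestamp length is 6767. There are 652 punctuation marks of five kinds, [？，。.、], which still leaves a difference of 5. I counted, and the enumeration comma (、) occurs exactly five times. Is this expected? The provided test audio, by contrast, has matching lengths.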
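The arithmetic in this report can be reproduced with a small check. This is a sketch only, and it assumes that each non-punctuation character should have exactly one timestamp entry. That may not hold for Latin-script words or digits, and it is not a confirmed property of FunASR's output. The sample text and timestamps below are made up for illustration.

```python
# The five punctuation marks named in the report.
PUNCTUATION = set("？，。.、")


def count_spoken_chars(text: str, punctuation=PUNCTUATION) -> int:
    """Count characters that are neither punctuation nor whitespace."""
    return sum(1 for ch in text if ch not in punctuation and not ch.isspace())


def length_gap(text: str, timestamps: list) -> int:
    """Spoken-character count minus the number of timestamp entries."""
    return count_spoken_chars(text) - len(timestamps)


# Made-up example: six spoken characters and six timestamp pairs.
text = "你好，世界。一、二"
timestamps = [[0, 100], [100, 200], [200, 300], [300, 400], [400, 500], [500, 600]]
gap = length_gap(text, timestamps)

# Numbers quoted in the report: 7424 characters, 652 punctuation marks, 6767 timestamps.
reported_gap = 7424 - 652 - 6767

print(gap, reported_gap)
```

A non-zero gap on real output, as in the report, points to characters that are counted in the text but have no timestamp entry.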

GioGioBond

question

Short sentences spoken directly as voice input cause problems

### Problems 1. For example, with the voice input “二十二” (twenty-two), a `TypeError: 'list' object cannot be interpreted as an integer` occurs. 2. Also, how can the output be rendered as digits, for example for phone numbers? ### Code ```python funasr_model= AutoModel(model="./FunASR/paraformer-zh", vad_model="./FunASR/fsmn-vad", punc_model="./FunASR/ct-punc", device="cuda:3") filename = "./audio.wav" with open(filename, "wb") as f:...
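For the second question, one possible workaround is to post-process the recognized text. The sketch below maps spoken Chinese digits to Arabic digits one character at a time, which suits digit strings such as phone numbers. It is a hypothetical helper and not FunASR's inverse text normalization. It also does not handle positional numerals such as “二十二”.

```python
# Character-level mapping. "幺" is a common spoken form of 1 in phone numbers.
DIGIT_MAP = {
    "零": "0", "〇": "0", "一": "1", "幺": "1", "二": "2", "三": "3",
    "四": "4", "五": "5", "六": "6", "七": "7", "八": "8", "九": "9",
}


def spoken_digits_to_arabic(text: str) -> str:
    """Replace each Chinese digit character with its Arabic digit; keep the rest."""
    return "".join(DIGIT_MAP.get(ch, ch) for ch in text)


phone = spoken_digits_to_arabic("幺三八零零一二三四五六")
print(phone)
```

Numbers with place-value characters (十, 百, 千) need a proper numeral parser or an ITN component instead of this per-character mapping.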

ArboterJams

question

Latest funasr: fine-tuning the model by following the tutorial fails with: forward() missing 4 required positional arguments: 'speech', 'speech_lengths', 'text', and 'text_lengths'

## 🐛 Bug: Fine-tuning the model by following the tutorial fails with: forward() missing 4 required positional arguments: 'speech', 'speech_lengths', 'text', and 'text_lengths' ### To Reproduce Steps for fine-tuning the model, following the tutorial at https://github.com/alibaba-damo-academy/FunASR/blob/main/examples/industrial_data_pretraining/paraformer/README_zh.md: 1. cd examples/industrial_data_pretraining/paraformer 2. sh train_from_local.sh 3. train.jsonl and val.jsonl...

Xsx93

bug

Questions about tokens.json and how it is generated

Notice: In order to resolve issues more efficiently, please raise issues following the template and add the relevant details. ## ❓ Questions and Help I looked at the token.json file of the pretrained Vietnamese UniASR model on ModelScope. It contains many tokens such as "", "", "-@@", "'", "'@@", "a", "a@@". Does the @ symbol in the token file have a special meaning? The token file I generated with aishell/paraformer/utils/text2token.py contains nothing involving the @ symbol.
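In several BPE toolkits (subword-nmt, for example), a trailing "@@" marks a subword piece that continues into the next token, so "a@@" is an "a" inside a word and "a" is an "a" that ends one. Whether the Vietnamese UniASR vocabulary follows this convention is an assumption here and not something the issue confirms. Under that assumption, tokens can be merged back into words roughly as follows. The sample token sequence is made up.

```python
def merge_bpe_tokens(tokens, marker="@@"):
    """Join subword tokens into words, treating a trailing marker as 'continues'."""
    words = []
    current = ""
    for tok in tokens:
        if tok.endswith(marker):
            # Strip the marker and keep accumulating the current word.
            current += tok[: -len(marker)]
        else:
            # A token without the marker closes the current word.
            words.append(current + tok)
            current = ""
    if current:
        # The sequence ended on a continuation piece; keep what was collected.
        words.append(current)
    return words


tokens = ["xin", "ch@@", "ào"]
words = merge_bpe_tokens(tokens)
print(words)
```

A character-level tokenizer such as the text2token.py script mentioned above would not emit continuation markers, which would be consistent with the difference the reporter observed.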

xiulianzw

question

Does funasr support testing Qwen-Audio on two GPUs?

## ❓ Questions and Help With a single card, a T4 runs out of memory (OOM). I have several T4 cards. What should I change in order to use multiple cards? #### Code ``` from funasr import AutoModel model = AutoModel(model="Qwen-Audio", model_path="models/Qwen-Audio" ) audio_in = "39937061.wav" # 1st dialogue turn prompt = '这个内容是什么?' cache =...

lai-serena

question

Bug when using the uniasr dialect model together with the vad and punc models

5

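## ❓ Questions and Help This was already discussed in the DingTalk group, where it was described as a uniasr model adaptation issue. I hope it can be resolved as soon as possible. ![1709105459028_F8371239-1841-4fd6-AB45-8B09CAD4AAE2](https://github.com/alibaba-damo-academy/FunASR/assets/74812416/00a2388d-a56e-424e-89e9-2722d975c810) #### What have you tried? Read the source code and confirmed that it is a model issue #### What's your environment? - linux - funasr 1.0.10 - ModelScope Version 1.12.0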

seanzhang-zhichen

question

Fine-tuning on Vietnamese does not converge

Notice: In order to resolve issues more efficiently, please raise issues following the template and add the relevant details. I used the UniASR real-time Vietnamese model, https://modelscope.cn/models/iic/speech_UniASR_asr_2pass-vi-16k-common-vocab1001-pytorch-online/summary. Because UniASR does not yet have updated fine-tuning and training scripts, I trained with the fine-tuning script from https://github.com/alibaba-damo-academy/FunASR/tree/main/examples/industrial_data_pretraining/paraformer. I kept the training parameters the same as in the script and only changed the pretrained model to the UniASR Vietnamese model. After 100 epochs of training, accuracy has stayed at around 80% ![image](https://github.com/alibaba-damo-academy/FunASR/assets/20657139/6dd93fc1-dcc5-4efb-bc77-3bd8fa9c4005) ## ❓ Questions and Help Do I need to regenerate the tokens and cmvn files and retrain? ### Before asking: 1. search the...

xiulianzw

question
