FunASR issues

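Fine-tuning support for mossformer2

In our experiments, mossformer2 performs considerably better than mossformer, but the repo does not yet appear to contain any mossformer2 code. Are there plans to release mossformer2, and if so, what is the timeline?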

Running python client for 5.5 hours long audio recognition fails

1

OS: Linux (CentOS Linux release 7.8.2003 (Core))
Python/C++ Version: Python 3.8.18
Package Version: pytorch-wpe (0.0.1), torchaudio (2.1.0), modelscope (1.9.2), funasr (0.8.0)
Model: speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx, punc_ct-transformer_zh-cn-common-vocab272727-onnx, speech_fsmn_vad_zh-cn-16k-common-onnx
Command: python3 funasr_wss_client.py --host 127.0.0.1 --port 10095 --mode offline --audio_in /data/asr/temp/replay.1693548503.57554748.wav --send_without_sleep --output_dir /data/asr/funasr-runtime-resources/samples/python
Details: The...

JacksonPan123

bug

Unable to start the speech recognition model in an offline environment

6

OS: Linux
Python: 3.10
Package: pytorch 1.13.1, modelscope 1.3.0, funasr 0.3.0
Model: auto-speech-recognition
Command: inference_pipeline = pipeline( task=Tasks.auto_speech_recognition, model=params["model_dir"], model_revision="v1.2.1", vad_model='damo/speech_fsmn_vad_zh-cn-16k-common-pytorch', vad_model_revision="v1.1.8", punc_model='damo/punc_ct-transformer_zh-cn-common-vocab272727-pytorch', punc_model_revision="v1.1.6", batch_size=64, )
Error log: I am loading the model locally. On a machine that can reach the public internet, the model starts normally, as shown here: ![image](https://github.com/alibaba-damo-academy/FunASR/assets/117268982/bb3ddca5-4a4c-4382-a048-4d22cf48136c) On a server that cannot reach the public internet, however, it fails to start. The problem is as follows...

ljazbtzv99936

bug

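Fine-tuning with a different vocabulary size

1

Is it currently possible to fine-tune a pretrained model with a different vocabulary size? For example, fine-tuning a pretrained model whose vocabulary size is 8404 on a dataset with a vocabulary of about 200.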

ruby11dog

duplicate

Bug in the LoRA module

https://github.com/alibaba-damo-academy/FunASR/blob/4854d398708594a13e3043daf1a19adfde970ea2/funasr/modules/lora/layers.py#L216
See the loralib issue: https://github.com/microsoft/LoRA/issues/34
eval() is not called during inference, so the LoRA weights are never merged in. Their branch code can be used instead: https://github.com/microsoft/LoRA/tree/bugfix_MergedLinear
For reference, here is also a blog post by Su Jianlin: https://kexue.fm/archives/9590

fclearner

bug
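The merge step mentioned in the LoRA report above can be illustrated with a toy layer. This is a minimal pure-Python sketch, not FunASR's or loralib's code: the class `ToyLoRALinear` and its helpers are hypothetical, and the shapes assume the usual LoRA convention (weight is out x in, `lora_a` is r x in, `lora_b` is out x r, scaling is alpha / r). It shows the invariant a correct implementation should keep: the output is the same whether the low-rank update is applied on the side or merged into the weight when `eval()` is called.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


class ToyLoRALinear:
    """Hypothetical toy layer: y = x W^T + scaling * x A^T B^T."""

    def __init__(self, weight, lora_a, lora_b, alpha):
        self.weight = [row[:] for row in weight]  # shape (out, in)
        self.lora_a = lora_a                      # shape (r, in)
        self.lora_b = lora_b                      # shape (out, r)
        self.scaling = alpha / len(lora_a)        # alpha / r
        self.merged = False

    def eval(self):
        """Fold scaling * (B @ A) into the weight once, as a merge step would."""
        if not self.merged:
            delta = matmul(self.lora_b, self.lora_a)  # shape (out, in)
            self.weight = [
                [w + self.scaling * d for w, d in zip(w_row, d_row)]
                for w_row, d_row in zip(self.weight, delta)
            ]
            self.merged = True
        return self

    def forward(self, x):
        out = matmul(x, transpose(self.weight))
        if not self.merged:
            # Unmerged path: add the low-rank branch explicitly.
            low = matmul(matmul(x, transpose(self.lora_a)), transpose(self.lora_b))
            out = [
                [o + self.scaling * l for o, l in zip(o_row, l_row)]
                for o_row, l_row in zip(out, low)
            ]
        return out
```

Comparing `layer.forward(x)` before and after `layer.eval()` on the same input is a quick way to check whether a merge implementation changes the result when it should not.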

noise token

I want to add a noise token for training, so I added the following to config.yaml: ![企业微信截图_16958024425531](https://github.com/alibaba-damo-academy/FunASR/assets/42926130/2ae0a6d1-adf6-4a4e-8eef-1dab579fd02a) and set ignore_init_mismatch: true. After adding the noise token, what should the text file look like for training? ![企业微信截图_16958026149088](https://github.com/alibaba-damo-academy/FunASR/assets/42926130/71fe0e08-bd24-42c5-8c48-a7711155db05)

hzfei

only generate 2 words

3

I used the code provided in the tutorial, but it only generated the first two Chinese characters of the audio. Why?

TuuSiwei

Unable to load the model correctly when using paraformer for ASR

1

OS: Linux
Python/C++ Version: Python 3.9.17
Package Version: pytorch 2.0.1, torchaudio 2.0.2, modelscope 1.9.0, funasr (pip list) 0.7.6
Model: speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch
Command: python paraformer_infer.py --wav_dir xxxx --output_dir xxxx (this simply calls inference)
Details: [code]: inference_pipeline =...

TristanLiu0101

funasr_wss_client reports an error

1

```python
python funasr_wss_client.py
Namespace(audio_in=None, chunk_interval=10, chunk_size=[5, 10, 5], host='localhost', mode='2pass', output_dir=None, port=10095, send_without_sleep=True, ssl=1, thread_num=1, words_max_print=10000)
Namespace(audio_in=None, chunk_interval=10, chunk_size=[5, 10, 5], host='localhost', mode='2pass', output_dir=None, port=10095, send_without_sleep=True, ssl=1, thread_num=1, words_max_print=10000)
connect...
```

monkeycc

TN error

2

Chinese input: `拨打 0551-4376729 购买彩票.`
Current output: `拨打零五五一年四十三月七六七二九购买彩票.`
Expected output: `拨打零五五幺四三七六七二九购买彩票.`

lifeiteng
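The expected output in the TN report above corresponds to reading a phone number digit by digit (with 1 read as 幺) rather than as a date. The following is only a sketch of that rule under stated assumptions, not FunASR's text normalization: the names `DIGIT_READING`, `PHONE_PATTERN`, and `read_phone_numbers` are hypothetical, and the pattern assumes a 3 or 4 digit area code, a hyphen, and a 7 or 8 digit local number.

```python
import re

# Digit-by-digit reading used for phone numbers; "1" is read as 幺.
DIGIT_READING = {
    "0": "零", "1": "幺", "2": "二", "3": "三", "4": "四",
    "5": "五", "6": "六", "7": "七", "8": "八", "9": "九",
}

# Assumed landline shape: area code, hyphen, local number.
# Surrounding whitespace is consumed so the spoken form joins the sentence.
PHONE_PATTERN = re.compile(r"\s*([0-9]{3,4})-([0-9]{7,8})\s*")


def read_phone_numbers(text):
    """Replace phone-like spans in text with their digit-by-digit reading."""
    def spell(match):
        digits = match.group(1) + match.group(2)
        return "".join(DIGIT_READING[d] for d in digits)

    return PHONE_PATTERN.sub(spell, text)
```

A real normalizer would need to run a rule like this before any date rule and guard against matching inside longer digit runs; the sketch ignores both concerns.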
