FunASR

A fundamental end-to-end speech recognition toolkit with open-source SOTA pretrained models, supporting speech recognition, voice activity detection, text post-processing, etc.

Results: 484 FunASR issues

Note: this tutorial is based entirely on FunASR for punctuation-model fine-tuning and ONNX model export; modelscope is not involved.

### 1. Punctuation model training

Punctuation model training and fine-tuning follow the FunASR/egs/aishell2 recipe, as follows:

**1) Download the pretrained punctuation model folder punc_ct-transformer_zh-cn-common-vocab272727-pytorch into the local FunASR/egs/aishell2 directory.**

**2) Create a new tokenize_text.py in the FunASR/egs/aishell2 directory for text and punctuation processing. It extracts characters and punctuation from the input text according to the punc.yaml config file inside the pretrained punc_ct-transformer_zh-cn-common-vocab272727-pytorch folder. The WeTextProcessing toolkit can be used for text normalization, or FunASR's own normalization scripts can be used instead.**

The tokenize_text.py script is as follows:

```
#!/usr/bin/env python3
import argparse
from collections...
```
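Since the full script above is cut off, here is a minimal hypothetical sketch of the character/punctuation split that such a tokenize_text.py step performs: the punctuation set, the "_" no-punctuation label, and the function name are assumptions for illustration, not values read from punc.yaml.

```python
# Hypothetical sketch: split raw text into a character sequence and a
# parallel punctuation-label sequence ("_" = no punctuation follows the
# character). The punctuation set is an assumption, not from punc.yaml.
PUNCS = {"，", "。", "？", "、"}

def split_text_and_punc(line: str):
    tokens, labels = [], []
    for ch in line.strip():
        if ch in PUNCS:
            if labels:
                labels[-1] = ch  # attach punctuation to the preceding character
        elif not ch.isspace():
            tokens.append(ch)
            labels.append("_")
    return tokens, labels

tokens, labels = split_text_and_punc("你好，世界。")
# tokens → ['你', '好', '世', '界'], labels → ['_', '，', '_', '。']
```

The parallel label sequence is the usual training target layout for CT-Transformer-style punctuation models: one label per input token.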

The paper "CONTROLLABLE TIME-DELAY TRANSFORMER FOR REAL-TIME PUNCTUATION PREDICTION AND DISFLUENCY DETECTION" proposes joint modeling of punctuation prediction and disfluency detection. Where in the code is the disfluency-detection part?

I fine-tuned the public Paraformer model on tens of thousands of hours of data; after training, the English recognition results drop characters severely. In the text files I manually added spaces between Chinese characters, between Chinese and English, and between English words. (Without the manually added spaces, fine-tuning diverges; the preprocessor does not seem to take effect, even though config.yaml has use_preprocessor: true.) ![image](https://github.com/alibaba-damo-academy/FunASR/assets/87749363/273980ce-a135-4eeb-a94b-b8579dfa7c50)
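For reference, the manual spacing described above can be sketched as a small helper that inserts spaces between individual CJK characters and Latin runs. This is an assumption about the intended format, not FunASR's actual preprocessor:

```python
import re

# Hypothetical helper (not part of FunASR): tokenize a mixed
# Chinese/English line into single CJK characters and non-CJK runs,
# then rejoin with single spaces.
def add_spaces(text: str) -> str:
    parts = re.findall(r"[\u4e00-\u9fff]|[^\u4e00-\u9fff\s]+", text)
    return " ".join(parts)

print(add_spaces("今天天气hello world不错"))
# → 今 天 天 气 hello world 不 错
```

Running every transcript line through such a normalizer before writing the text file gives the uniform spacing the fine-tuning run appears to require.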

OS: Linux (CentOS Linux release 7.8.2003 (Core))
Python/C++ Version: Python 3.8.18
Package Version: pytorch-wpe (0.0.1), torchaudio (2.1.0), modelscope (1.9.2), funasr (0.8.0)
Hardware: 16-core vCPU, 32 GB memory
Model: speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx, punc_ct-transformer_zh-cn-common-vocab272727-onnx, speech_fsmn_vad_zh-cn-16k-common-onnx
Command:

```
nohup bash run_server.sh \
  --download-model-dir /workspace/models \
  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
  --model-dir...
```

bug

Linux: Ubuntu 20.04.4, python=3.8.18, torch=2.0.1, funasr=0.8.2, modelscope=1.9.3

While training an e-branchformer model, I force-stopped training at epoch 130 and then used the valid.acc.best.pb checkpoint to decode the test set: the decoded output was entirely , and replacing the test set with the training set gave the same result. What is the cause, and how can it be fixed?

Training log of the last complete epoch:

```
[autodl-container-28da11ab52-63ba08d6] 2023-11-28 19:14:21,657 (build_trainer:248) INFO: 132/180epoch started. Estimated time to finish: 3.555612834232201 hours
[autodl-container-28da11ab52-63ba08d6] 2023-11-28 19:14:36,118 (build_trainer:730) INFO: 132epoch:train:1-50batch:136814num_updates: iter_time=0.041, forward_time=0.092, loss_ctc=2.658,...
```

OS: CentOS 7.9 x86_64
Python/C++ Version: Python 3.8
Package Version: pytorch (0.0.1), torchaudio (2.1.0), modelscope (1.9.4), funasr (0.8.4)
Model: speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch
Command: python finetune.py
Details: GPU: 3090 with 24 GB of memory; roughly 30,000 hours of training data; fine-tuning parameters:

```
params.dataset_type = "large"
params.batch_bins = 130000
params.max_epoch = 20
params.lr = 0.0007
```

Problem description: acc was 0.29 on the first batch, but by batch 281150 (after more than two days of training) acc is only 0.086. Is this a data problem?

```
INFO: 1epoch:train:1-50batch:50num_updates: iter_time=0.058, forward_time=0.372,...
```
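To judge whether acc is genuinely trending down rather than fluctuating between log intervals, one option is to scrape the acc values out of the training log and inspect the whole series. The helper below is a hypothetical sketch; it assumes the log lines contain an "acc=<float>" field in the same comma-separated style as the line quoted above:

```python
import re

# Hypothetical helper (not a FunASR tool): extract per-interval training
# accuracy from log lines, assuming an "acc=<float>" field is present.
def extract_acc(log_lines):
    return [float(m.group(1))
            for line in log_lines
            if (m := re.search(r"\bacc=([0-9.]+)", line))]

logs = [
    "INFO: 1epoch:train:1-50batch: iter_time=0.058, forward_time=0.372, acc=0.290",
    "INFO: 1epoch:train:51-100batch: iter_time=0.060, forward_time=0.370, acc=0.251",
]
print(extract_acc(logs))  # → [0.29, 0.251]
```

Plotting or averaging the extracted series makes it easier to tell a steady collapse (which often points at label/audio misalignment or a bad learning rate) from ordinary per-batch noise.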

Using FunASR/egs I trained a paraformer model for a low-resource-language ASR task, then replaced the model file and config file under damo/speech_paraformer_asr_nat-zh-cn-16k-aishell1-vocab4234-pytorch with the files produced by that training run, and ran inference with the following code:

```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model='models_from_modelscope/damo1/speech_paraformer_asr_nat-zh-cn-16k-aishell1-vocab4234-pytorch',
)
rec_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav')
print(rec_result)
```

**The returned result is empty!**

![image](https://github.com/alibaba-damo-academy/FunASR/assets/38098690/4d40d17f-c2bf-4289-9051-f3db8c8e6310) ![05(1)](https://github.com/alibaba-damo-academy/FunASR/assets/38098690/17ed9ad2-df1c-4313-a470-be5e15f31251) ![03](https://github.com/alibaba-damo-academy/FunASR/assets/38098690/730396ed-46e6-40c4-8149-856de9464955)

How can this be deployed on Android devices? Will deployment via MNN be supported later?

![image](https://github.com/alibaba-damo-academy/FunASR/assets/28730421/fb488ab8-b672-4d67-9cbd-cf8bbb53a5cd) Environment: Linux, python=3.9.0, torch=2.1.1, funasr=0.8.4, modelscope=1.9.1