FunASR
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing, etc.
Location: https://github.com/alibaba-damo-academy/FunASR/blame/6e86c5044d30dffe356b6e42838d01b7cfaf4272/README.md#L158C2-L158C3

The original code:

```python
wav_file = f"{model.model_path}/example/asr_example.wav"
```

I guess you meant:

```python
wav_file = f"{model.model_path}/example/vad_example.wav"
```
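For context, a minimal sketch of the README snippet this line belongs to, assuming the `AutoModel` API and the `fsmn-vad` model alias that FunASR's README uses:

```python
from funasr import AutoModel

# load the FSMN VAD model that the README example is demonstrating
model = AutoModel(model="fsmn-vad")

# the line in question: the VAD example should load vad_example.wav
wav_file = f"{model.model_path}/example/vad_example.wav"
res = model.generate(input=wav_file)
print(res)  # VAD segments as [[start_ms, end_ms], ...]
```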
```
from funasr import AutoModel
  File "C:\Users\loong\.conda\envs\nlp\lib\site-packages\funasr\__init__.py", line 33, in <module>
    from funasr.auto.auto_model import AutoModel
  File "C:\Users\loong\.conda\envs\nlp\lib\site-packages\funasr\auto\auto_model.py", line 19, in <module>
    from funasr.utils.load_utils import load_bytes
  File "C:\Users\loong\.conda\envs\nlp\lib\site-packages\funasr\utils\load_utils.py", line 8, in <module>
    import torchaudio
...
```
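The traceback dies at `import torchaudio`, so a quick sanity check, assuming a missing or mismatched torchaudio install, is to import it directly before touching funasr:

```python
# if this fails, or the two versions come from different builds, reinstall
# torchaudio against the same torch build before retrying funasr
import torch
import torchaudio

print(torch.__version__, torchaudio.__version__)
```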
Thanks for open-sourcing this excellent work. Since only Chinese and English are currently supported, I plan to use the faster-whisper model for ASR in other languages. Does this project support using faster-whisper models?
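Nothing in this issue confirms a faster-whisper integration in FunASR; for reference, a standalone faster-whisper call (its own API, separate from FunASR, with illustrative model and device choices) looks roughly like this:

```python
from faster_whisper import WhisperModel

# model size, device, and precision here are example choices
model = WhisperModel("large-v3", device="cuda", compute_type="float16")
segments, info = model.transcribe("audio.wav", beam_size=5)
for seg in segments:
    print(f"[{seg.start:.2f}s -> {seg.end:.2f}s] {seg.text}")
```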
System: Ubuntu 22.04
Versions: funasr==1.0.18, modelscope==1.11.1
Inference code:

```python
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

inference_pipeline = pipeline(
    task=Tasks.auto_speech_recognition,
    model='damo/speech_UniASR_asr_2pass-cantonese-CHS-16k-common-vocab1468-tensorflow1-online',
    model_revision='v2.0.4',
    vad_model='iic/speech_fsmn_vad_zh-cn-16k-common-pytorch',
    vad_model_revision="v2.0.4",
    vad_kwargs={"max_single_segment_time": 60000},
    punc_model='iic/punc_ct-transformer_zh-cn-common-vocab272727-pytorch',
    punc_model_revision="v2.0.4",
)
rec_result = inference_pipeline(input='./0325.wav')
print(rec_result[0])
```

Problem: 0325.wav is about 4 minutes long and inference fails on it, while the first 10 seconds of the same audio transcribes fine.
Error message: ...
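Since the first 10 seconds work, a hedged workaround sketch is to chunk the file and feed each piece to the same pipeline; this narrows the failure down to length rather than fixing it:

```python
import soundfile as sf

audio, sr = sf.read('./0325.wav')
chunk = 10 * sr  # 10-second pieces, matching the length known to work
for i in range(0, len(audio), chunk):
    part = f'./0325_part{i // chunk}.wav'
    sf.write(part, audio[i:i + chunk], sr)
    print(inference_pipeline(input=part)[0])  # pipeline from the report above
```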
Background: I am exploring how to deploy Paraformer on edge devices, running inference on the NPU through the RK framework, which only supports fp16-precision models. FP16 can represent roughly [-65504, 65504], while FP32 covers [-3.4×10^38, 3.4×10^38], so converting an FP32 model directly to an RK model overflows (NaN) during inference. I followed the FunASR tutorial https://github.com/alibaba-damo-academy/FunASR/blob/v0.8.8/funasr/export/README.md for INT8 quantization, but that scheme is dynamic quantization: at compute time the values are dequantized back to fp32. Question: is there a true fp16 model, or a finetuning recipe that produces one?
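The overflow claim is easy to verify numerically; a two-line demonstration of the fp16 range limit described above:

```python
import numpy as np

print(np.finfo(np.float16).max)     # 65504.0, the largest finite fp16 value
print(np.float16(np.float32(7e4)))  # inf: 70000 overflows when cast to fp16
```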
https://github.com/alibaba-damo-academy/FunASR/issues/1478
https://www.modelscope.cn/models/dengcunqin/speech_paraformer-large_asr_nat-zh-cantonese-en-16k-vocab8501-online/summary

```
model_name_or_model_dir="dengcunqin/speech_paraformer-large_asr_nat-zh-cantonese-en-16k-vocab8501-online"
model_revision="master"

torchrun \
  --nnodes 1 \
  --nproc_per_node ${gpu_num} \
  funasr/bin/train.py \
  ++model="${model_name_or_model_dir}" \
  ++model_revision="${model_revision}" \
  ++train_data_set_list="${train_data}" \
  ++valid_data_set_list="${val_data}" \
  ++dataset_conf.batch_size=64 \
  ++dataset_conf.batch_type="token" \
  ++dataset_conf.num_workers=4 \
  ++train_conf.max_epoch=50 \
  ...
```
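The `${train_data}`/`${val_data}` variables above point at data lists; a hedged sketch of building one, assuming the jsonl schema used in FunASR's training examples (one key/source/target record per line — an assumption, not taken from this issue):

```python
import json

# (utterance id, wav path, transcript) triples for your own corpus
samples = [("utt1", "/data/wav/utt1.wav", "transcript text")]

with open("train.jsonl", "w", encoding="utf-8") as f:
    for key, wav, text in samples:
        f.write(json.dumps({"key": key, "source": wav, "target": text},
                           ensure_ascii=False) + "\n")
```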
To reproduce: run the project as-is; the segment times returned by the `GetSegments` function of an `AliFsmnVad` instance exceed the actual audio duration.

Environment:
- OS: Windows 11
- FunASR version: latest
- Microsoft.ML.OnnxRuntime: latest

Test audio: the asr_example.wav provided in speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch\example
Download: https://www.modelscope.cn/models/iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch/file/view/master/example%2Fasr_example.wav?status=0
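A hedged cross-check sketch: run the same wav through FunASR's Python fsmn-vad and compare the segment end times against the true file length, to see whether the C# `AliFsmnVad` wrapper or the model itself is off (model alias and output shape assumed from the FunASR README):

```python
import soundfile as sf
from funasr import AutoModel

wav = "asr_example.wav"
audio, sr = sf.read(wav)
duration_ms = len(audio) / sr * 1000

vad = AutoModel(model="fsmn-vad")
segments = vad.generate(input=wav)[0]["value"]  # [[start_ms, end_ms], ...]
print(f"file: {duration_ms:.0f} ms, segments: {segments}")
for _, end in segments:
    assert end <= duration_ms, "segment ends past the end of the file"
```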
In my audio, the gap between the teacher's question and the student's answer is about one second. I reduced max_end_silence_time to 500 ms to try to pin down sentence endings more precisely, but it had no effect and the teacher's and student's speech still cannot be cleanly separated. What other settings are worth trying?

**vad_model configuration:**
```
frontend: WavFrontendOnline
frontend_conf:
  fs: 16000
  window: hamming
  n_mels: 80
  frame_length: 25
  frame_shift: 10
  dither: 0.0
  lfr_m: 5
  lfr_n: 1
model: FsmnVADStreaming
model_conf:
  sample_rate: 16000
  detect_mode: 1
...
```
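One thing worth verifying is that the override actually reaches the model. A minimal sketch that passes VAD overrides through `vad_kwargs`, the same mechanism the modelscope pipeline report above uses; the `speech_noise_thres` value is an assumption (a knob from the same FSMN-VAD config family), not a confirmed fix:

```python
from funasr import AutoModel

model = AutoModel(
    model="paraformer-zh",
    vad_model="fsmn-vad",
    vad_kwargs={
        "max_end_silence_time": 500,  # ms of trailing silence before a cut
        "speech_noise_thres": 0.8,    # assumption: stricter speech/noise split
    },
)
print(model.generate(input="lesson.wav"))
```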
When finetuning the downloaded speech_paraformer_asr_nat-zh-cn-8k-common-vocab8358-tensorflow1 model on my own dataset with finetune.sh, warnings like the following appear:

```
grad.sizes() = [1, 320], strides() = [1, 1]
bucket_view.sizes() = [1, 320], strides() = [320, 1] (Triggered internally at ../torch/csrc/distributed/c10d/reducer.cpp:325.)
  Variable._execution_engine.run_backward(  # Calls into the C++ engine to...
```
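This is PyTorch DDP's grad-stride/bucket-view mismatch warning. A hedged sketch of the usual general-PyTorch mitigation (not a documented FunASR switch) is to construct DDP with `gradient_as_bucket_view=True`, so gradients share the bucket's memory layout instead of being copied:

```python
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

# inside an initialized process group, wrapping whatever module finetune.sh builds
net = nn.Linear(320, 320).cuda()
ddp = DDP(net, gradient_as_bucket_view=True)  # avoids the stride-mismatch copy
```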