
A Fundamental End-to-End Speech Recognition Toolkit with Open-Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing, etc.

555 FunASR issues

Does FunASR support speech_campplus_speaker-diarization_common?

question

## ❓ When using SenseVoice, how do I enable punctuation but disable inverse text normalization (ITN)?

## Code

```python
from funasr import AutoModel
from funasr.utils.postprocess_utils import rich_transcription_postprocess

model_dir = "iic/SenseVoiceSmall"
model = AutoModel(
    model=model_dir,
    vad_model="fsmn-vad",
    vad_kwargs={"max_single_segment_time": 30000},
    device="cuda:0",
)
res = model.generate(
    input=f"{model.model_path}/example/en.mp3",
    ...
```

question

#### What is your question?

Using the Paraformer Chinese general 16k offline large long-audio model (https://modelscope.cn/models/iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch), I fine-tuned with 20 hours of data and completed the quantized export and testing on the fine-tuning server, where the results were very good. However, after using the quantized weight file to replace the weight file under the corresponding quantized model inside docker and restarting, the output quality is worse than in those tests. Do the dependent VAD, punc, and LM models also need to be fine-tuned with the same data?

#### Code

#### What have you tried?

I synced all of the model's dictionary and config files into the corresponding model directory in docker, replacing the identical files. Conversely, when I imported the long-audio quantized model from docker onto the server and replaced its weights with the fine-tuned, quantized weight file, the results were very good.

#### What's your environment?

- OS (e.g., Linux):
- FunASR Version (e.g., 1.0.0): 1.0.12
- ModelScope...

question

```python
from funasr import AutoModel
import time

wav_file = "/mnt/data/toolbox_dir/voice_trans/test-file/vad_example.wav"
model = AutoModel(
    model="/mnt/data/toolbox_dir/voice_trans/Whisper-large-v3",
    vad_model="/mnt/data/toolbox_dir/voice_trans/speech_fsmn_vad_zh-cn-16k-common-pytorch",
    vad_kwargs={"max_single_segment_time": 30000},
    punc_model="/mnt/data/toolbox_dir/voice_trans/punc_ct-transformer_cn-en-common-vocab471067-large",
    spk_model="/mnt/data/toolbox_dir/voice_trans/speech_campplus_sv_zh-cn_16k-common",
    device="cuda:2",
)
start_time = time.time()
res = model.generate(
    input=wav_file,
    batch_size_s=300,
    batch_size=1,
)
...
```

question

On 1.2.4, following the official example, running on CUDA with output_timestamp (VAD enabled) raises the error below; running on CPU does not. However, on CPU, some long audio files (over 30 minutes) produce a timestamp length that does not match the text length (when use_itn is enabled), or an error from `targets.size(-1) == 0` in funasr/models/sense_voice/utils/ctc_alignment.py (when use_itn is disabled). Please look into this.

## 🐛 Bug

/usr/local/lib/python3.10/site-packages/pydub/utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work warn("Couldn't find ffmpeg or avconv - defaulting to ffmpeg,...

bug

## ❓ Questions and Help

#### What is your question?

For the LLM-ASR task, I trained on AISHELL with the default whisper_qwen_linear.yaml for 10 epochs and ran inference with best_model.pt.

First run: the default whisper_qwen_linear.yaml includes SpecAugLFR, and inference very frequently produced nonsensical repetitions.
e.g. BAC009S0768W0178 撇油加加撇油加加撇油加加撇油加加撇油加加撇油。。。

Second run: after removing all dropout and SpecAugLFR from the default whisper_qwen_linear.yaml and retraining, the nonsensical repetitions became much rarer during inference, though they still occur occasionally. The problem shifted: one or two extra characters now appear at the start of the decoded output. I checked the mask and it seems fine, and I also tried disabling the prompt at inference time, but the result did not change.
e.g. BAC009S0766W0399 幢经过近两个星期的漫长等待 (reference: 经过近两个星期的漫长等待)

Relevant config and shell scripts:
conf: https://github.com/NiniAndy/FunASR/blob/mymerge/examples/industrial_data_pretraining/llm_asr/conf/whisper_qwen_linear.yaml
train.sh: https://github.com/NiniAndy/FunASR/blob/mymerge/examples/industrial_data_pretraining/llm_asr/demo_train_or_finetune.sh
inference.sh: https://github.com/NiniAndy/FunASR/blob/mymerge/examples/industrial_data_pretraining/llm_asr/infer_speech2text.sh

#### What's your environment?...

question

I fine-tuned following the official finetune.sh script; why does no loss value appear in log.txt?

question

At present the UniASR Min Nan (Hokkien) model gives the best results and has a real user base, so why is there no longer a plan to support on-device deployment for this model?

I am currently facing an issue with using multiple GPUs simultaneously when running inference on vLLM with Xinference. The setup works correctly when using a single GPU with smaller models,...

question

```python
from funasr import AutoModel
from funasr.utils.postprocess_utils import rich_transcription_postprocess

model_dir = "iic/SenseVoiceSmall"
model = AutoModel(
    model=model_dir,
    vad_model="fsmn-vad",
    vad_kwargs={"max_single_segment_time": 30000},
    device="cuda:0",
)

# en
res = model.generate(
    input="D:\\demo\\demo_1\\recording.wav",
    cache={},
    language="auto",  # "zn",...
```

question