
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

484 FunASR issues

In https://github.com/alibaba-damo-academy/FunASR/issues/908#issuecomment-1756610416 a reply says this problem was fixed after updating to modelscope 1.9.2, but I still hit it on modelscope 1.9.4:

```python
asr = pipeline(
    task=Tasks.auto_speech_recognition,
    model="./speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404",
    device='cuda',
)
param_dict['hotword'] = "https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/hotword.txt"
```

I have already downloaded the model to a local path, but on startup it still needs network access to download it again. This does not happen when the model files are present under ~/.cache/modelscope/hub/damo/, but in my offline environment I cannot download the files into that cache directory.

When running a speaker_diarization task, the segmentation output contains timestamps larger than the maximum length of the input audio.

```python
inference_diar_pipline = pipeline(
    mode="sond_demo",
    num_workers=0,
    task=Tasks.speaker_diarization,
    diar_model_config="sond.yaml",
    model='damo/speech_diarization_sond-zh-cn-alimeeting-16k-n16k4-pytorch',
    model_revision="v1.0.5",
    sv_model="damo/speech_xvector_sv-zh-cn-cnceleb-16k-spk3465-pytorch",
    sv_model_revision="v1.2.2",
)
audio_list = [
    "../2.wav",
    "../spk1.wav",
    "../spk2.wav",
    "../spk3.wav",
    "../spk4.wav",
]
results = inference_diar_pipline(audio_in=audio_list)
print(results)
```

Output (truncated):

```
{'text': 'spk1 [(0.0, 18.8), (55.36,...
```
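To reproduce the report above, the returned segment end times can be compared against the actual file duration; a minimal sketch in plain Python (the helper name, the segment values, and the 20 s duration are illustrative assumptions, not output from the pipeline):

```python
def segments_exceeding(segments, audio_duration_s):
    """Return the (start, end) segments whose end time lies beyond the audio length."""
    return [(start, end) for (start, end) in segments if end > audio_duration_s]

# Illustrative values: a ~20 s recording, but the diarization output
# contains a segment starting at 55.36 s, as in the issue above.
suspect = segments_exceeding([(0.0, 18.8), (55.36, 60.0)], audio_duration_s=20.0)
print(suspect)  # [(55.36, 60.0)]
```

Any non-empty result confirms that the diarization output extends past the end of the input audio.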

`--audio_in` (**required**) — the wav or pcm file path. Why is this parameter required?

In audio forwarded from Feng Ge, many filler words like "uh" and "um" actually belong to a single continuous sentence, yet it gets cut into several small fragments. Is there any way to join them back together?
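One caller-side workaround is to stitch fragments separated by short pauses back together in post-processing. A minimal sketch, not a FunASR API (the function name and the 0.5 s gap threshold are assumptions to be tuned per recording):

```python
def merge_segments(segments, max_gap=0.5):
    """Merge adjacent (start, end) segments whose silence gap is at most max_gap seconds."""
    merged = []
    for start, end in sorted(segments):
        if merged and start - merged[-1][1] <= max_gap:
            # Gap is short enough: extend the previous segment.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

# Two fragments 0.2 s apart are joined; the third stays separate.
print(merge_segments([(0.0, 1.0), (1.2, 2.0), (3.5, 4.0)]))
# [(0.0, 2.0), (3.5, 4.0)]
```

Raising `max_gap` merges more aggressively; a value around the typical filler-word pause length would keep "uh/um" runs inside one utterance.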

Our server is running on Ubuntu with docker (funasr:funasr-runtime-sdk-online-cpu-0.1.5), started with:

```shell
nohup bash run_server_2pass.sh \
  --model-dir damo/speech_paraformer-large-contextual_asr_nat-zh-cn-16k-common-vocab8404-onnx \
  --online-model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online-onnx \
  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
  --punc-dir damo/punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727-onnx \
  --itn-dir thuduj12/fst_itn_zh...
```

- FunASR 0.8.7. With the assignment in the code below, even a previously passed `cache_dir=args.export_dir` has no effect; the value is overwritten anyway. https://github.com/alibaba-damo-academy/FunASR/blob/d2266616a142b7f57f29f1a840458a9865f953ac/funasr/export/export_model.py#L180

bug

```
File "C:\miniconda\Lib\site-packages\funasr\bin\diar_inference_launch.py", line 140, in _forward
    assert all([len(example) >= 2 for example in raw_inputs]), \
AssertionError: The length of test case in raw_inputs must larger than 1 (>=2).
```

Does this mean two input files are mandatory?
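The assertion quoted above requires every entry of raw_inputs to contain at least two items; judging from the SOND demo input list shown earlier in this page, that would be the test recording plus at least one speaker enrollment recording. A toy check mirroring the assertion (the file names are placeholders, not real paths):

```python
def raw_inputs_ok(raw_inputs):
    """Mirror the assertion in diar_inference_launch.py:_forward."""
    return all(len(example) >= 2 for example in raw_inputs)

print(raw_inputs_ok([["meeting.wav", "spk1.wav"]]))  # True: test audio plus one profile
print(raw_inputs_ok([["meeting.wav"]]))              # False: would raise the AssertionError
```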

The config file does include the do_end_point_detection option, but from the code in e2e_vad.py it does nothing and just passes. So I am curious whether this is handled elsewhere, or is planned but not yet supported. Thanks in advance for clarifying.

```yaml
# vad.yaml
vad_post_conf:
    sample_rate: 8000
    detect_mode: 1
    snr_mode: 0
    max_end_silence_time: 800
    max_start_silence_time: 3000
    do_start_point_detection: True
    do_end_point_detection: False
```

Code location: https://github.com/alibaba-damo-academy/FunASR/blob/172e7ac986f299ad545cbd91a8cecc3ef967af36/funasr/models/e2e_vad.py#L414
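For context, end-point detection in a VAD typically fires once trailing silence exceeds max_end_silence_time. The toy sketch below illustrates that idea only; it is not FunASR's e2e_vad.py implementation, and the 10 ms frame size and helper name are assumptions:

```python
def detect_end_point(frame_is_speech, frame_ms=10, max_end_silence_ms=800):
    """Return the index of the frame where an end point would fire, or None.

    Toy model: once speech has started, an end point triggers after
    max_end_silence_ms of continuous non-speech.
    """
    silence_ms = 0
    in_speech = False
    for i, is_speech in enumerate(frame_is_speech):
        if is_speech:
            in_speech = True
            silence_ms = 0
        elif in_speech:
            silence_ms += frame_ms
            if silence_ms >= max_end_silence_ms:
                return i
    return None

# 5 speech frames followed by long silence: fires after 800 ms of silence.
print(detect_end_point([True] * 5 + [False] * 100))  # 84
```

With do_end_point_detection disabled, a segment would instead run until the stream ends, which matches the questioner's reading that the config flag currently has no effect in the code.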

For CT-Transformer multi-task training, is the dataset a single set of data, i.e. each sample carries labels for both tasks? Or do the two tasks use different data? And if the data differs, are the data volumes for the two tasks the same?

### In a Linux environment

1. On container startup, specify the model via `--model-dir damo/speech_paraformer-large-vad-punc-spk_asr_nat-zh-cn`
2. Update via the script: `bash funasr-runtime-deploy-offline-cpu-zh.sh update --asr_model damo/speech_paraformer-large-vad-punc-spk_asr_nat-zh-cn` (also tried the `--vad_model` parameter)

### Error message 1:

```
Traceback (most recent call last):
  File "/workspace/FunASR/funasr/utils/runtime_sdk_download_tool.py", line 23, in
    model_dir = snapshot_download(args.model_name, cache_dir=args.export_dir, revision=args.model_revision)
  File...
```