FunASR

A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.

555 FunASR issues

![a60f3506d696a38b139115fecf78845](https://github.com/user-attachments/assets/0b640cb7-cbde-4b79-96c7-cb2fa8353fa9) Testing with the web version, "is final" always stays false, so each uploaded audio file is parsed only once and I have to refresh the page. The speech wav is the file I uploaded.

bug

Notice: In order to resolve issues more efficiently, please raise issues following the template. ## ❓ Questions and Help I started a service with the official Python websocket script; after a client has been connected for more than two hours, recognition speed becomes slower and slower. Has anyone else run into this problem?

question

## 🐛 Bug ### To Reproduce Steps to reproduce the behavior (**always include the command...

bug

## 🐛 Bug ### To Reproduce Steps to reproduce the behavior (**always include the command...

bug

## ❓ Questions and Help In 2pass mode the online pass is fast, but its recognition quality is worse than offline, so I rely on the offline results, which take too long. I tried the `OrtSessionOptionsAppendExecutionProvider_CUDA` API to run on CUDA; computation does happen on the GPU, but the speed barely changes. My test audio is 10.84 seconds of continuous speech and takes more than 1.5 s to decode. Is there any way to speed up ONNX inference for the offline model? I also tried the offline Docker image, but it returns all results at once, which does not fit my use case. As I understand it, the offline image uses libtorch with a GPU and onnxruntime without one, and there is indeed a clear speed gap: with batch_size set to 1, libtorch decodes the audio mentioned above about 800 ms faster than ONNX. ### Before asking: 1. search...

question

#### What is your question? How can I generate subtitle files (e.g. *vtt, srt, lrc*) with this project? #### Code I did not find anything related to timing in the *rich_transcription_postprocess* function.

question
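Until the toolkit exposes subtitle export directly, one workaround is to build the SRT text yourself from segment timestamps. Below is a minimal sketch, assuming you can already obtain `(start_ms, end_ms, text)` triples for each segment (e.g. from a timestamp-capable model); the helper names are hypothetical, not FunASR APIs:

```python
def srt_timestamp(ms: int) -> str:
    """Format a millisecond offset as an SRT timestamp HH:MM:SS,mmm."""
    hours, rem = divmod(ms // 1000, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{ms % 1000:03d}"


def to_srt(segments) -> str:
    """Render a list of (start_ms, end_ms, text) triples as an SRT document."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, 1):
        # Each SRT cue: index, time range, text, blank separator line.
        blocks.append(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)
```

For example, `to_srt([(0, 1500, "Hello world")])` yields a cue starting with `1` and the range `00:00:00,000 --> 00:00:01,500`. VTT output differs mainly in the `WEBVTT` header and a `.` instead of `,` in timestamps.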

![image](https://github.com/user-attachments/assets/d941881e-1b69-41f1-a51a-a937b0ca4cb1) ## ❓ Questions and Help ### Before asking: 1. search the issues. 2. search...

question

```python
model = AutoModel(
    model="FunAudioLLM/SenseVoiceSmall",
    vad_model="fsmn-vad",
    punc_model="ct-punc",
    spk_model="cam++",
    vad_kwargs={"max_single_segment_time": 15000},
    batch_size=1,
    hub="hf",
    device=device,
)
```

console error =>

```markdown
ERROR:root:Only 'iic/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch' and 'iic/speech_seaco_paraformer_large_asr_nat-zh-cn-16k-common-vocab8404-pytorch' can predict timestamp, and speaker diarization relies...
```

bug

## 🐛 Bug Loading the ct-punc model directly with AutoModel correctly adds English punctuation to English sentences, but following the tutorial and using CT_Transformer from funasr_onnx does not generate English punctuation. ### To Reproduce AutoModel generates English punctuation correctly:

```python
from funasr import AutoModel

model = AutoModel(model="ct-punc")
model.generate(input='Hello world')
# [{'key': 'rand_key_2yW4Acq9GFz6Y',
#   'text': ' Hello world.',
#...

bug