wwfcnu comments

Results 106 comments of wwfcnu

Speech recognition task: input file format

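> @wwfcnu I asked a similar question. funasr does not currently support multiple file formats; the only workaround is to convert the files to single-channel, 16000 Hz wav files with a tool such as sox or ffmpeg. Note that the converted wav files must be 16000 Hz, otherwise recognition quality drops sharply.

If mp3 decoding were added to the source code, would recognition be somewhat faster?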
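The conversion described in the quoted comment can be sketched as follows. This is a minimal example, not part of funasr: it assumes `ffmpeg` is installed and on `PATH`, and the helper names `build_ffmpeg_cmd` and `convert_to_wav` are hypothetical. The flags `-ac 1` (one channel) and `-ar 16000` (16000 Hz) are standard ffmpeg audio options.

```python
import subprocess
from pathlib import Path
from typing import List, Optional


def build_ffmpeg_cmd(src: str, dst: str, sample_rate: int = 16000) -> List[str]:
    """Return an ffmpeg command that writes a mono WAV at the given sample rate."""
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it already exists
        "-i", src,                # input file in any format ffmpeg can decode
        "-ac", "1",               # downmix to a single channel
        "-ar", str(sample_rate),  # resample (16000 Hz by default)
        dst,
    ]


def convert_to_wav(src: str, dst: Optional[str] = None) -> str:
    """Convert src to a mono 16 kHz WAV and return the output path (hypothetical helper)."""
    out = dst or str(Path(src).with_suffix(".wav"))
    if Path(out) == Path(src):
        # Avoid asking ffmpeg to read and write the same file.
        out = str(Path(src).with_name(Path(src).stem + "_16k.wav"))
    subprocess.run(build_ffmpeg_cmd(src, out), check=True)
    return out
```

For example, `convert_to_wav("speech.mp3")` would write `speech.wav` next to the input, and the resulting path can then be passed to the recognizer.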

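> It is supported; you need to upgrade to the latest version. Upgrade commands: `pip install "modelscope[audio_asr]" --upgrade -f https://modelscope.oss-cn-beijing.aliyuncs.com/releases/repo.html` `git clone https://github.com/alibaba/FunASR.git && cd FunASR` `pip install --editable ./` > > Multiple audio formats, including mp3, and their sample rates are all supported, so users do not need to worry about the input audio format. If you run into audio that is not supported, please report it. repo: https://github.com/alibaba-damo-academy/FunASR > > You are welcome to join the funasr DingTalk group to discuss any problems you encounter: 27215013275

The main issue is the torchaudio version. Could the sox used for audio processing in the source code be replaced with ffmpeg?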

Does the modelscope pipeline still not support hot-reloading of hotwords?

Can batch_size be set as large as 5000?

The online model's response slows down after it has been running for a long time

> I ran the model end to end for three hours, following [the code here](https://github.com/alibaba-damo-academy/FunASR/blob/main/egs_modelscope/asr/paraformer/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online/demo_online_v2.py), and printed the number of executions and the running time. The full code is as follows: > > ``` > import os > import logging > import torch > import soundfile > > from modelscope.pipelines import pipeline > from modelscope.utils.constant import Tasks > from...

funasr online service

Using the image funasr-runtime-sdk-online-cpu-0.1.5

funasr online service

@lyblsgo

funasr online service

> `2pass-online` for real-time recognition results and `2pass-offline` for 2-pass corrected recognition results https://github.com/alibaba-damo-academy/FunASR/blob/main/runtime/docs/websocket_protocol_zh.md#%E4%BB%8E%E6%9C%8D%E5%8A%A1%E7%AB%AF%E5%BE%80%E5%AE%A2%E6%88%B7%E7%AB%AF%E5%8F%91%E6%95%B0%E6%8D%AE-1

Only these 3 kinds.
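A client might combine the two result types mentioned in the quoted comment roughly as sketched below. This is an illustration only: it assumes each server message is a JSON object with `mode` and `text` fields, and that `2pass-online` chunks are incremental while a `2pass-offline` message replaces the pending real-time text. The class `TwoPassTranscript` is hypothetical and not part of the FunASR SDK.

```python
import json


class TwoPassTranscript:
    """Accumulate a transcript from 2pass-online and 2pass-offline messages."""

    def __init__(self):
        self.confirmed = []  # segments corrected by the second (offline) pass
        self.partial = ""    # real-time text not yet corrected

    def feed(self, raw: str) -> str:
        """Consume one JSON message and return the transcript to display."""
        msg = json.loads(raw)
        mode = msg.get("mode")        # assumed field name
        text = msg.get("text", "")    # assumed field name
        if mode == "2pass-online":
            # Real-time result: append to the provisional text.
            self.partial += text
        elif mode == "2pass-offline":
            # Corrected result: it supersedes the provisional text.
            self.confirmed.append(text)
            self.partial = ""
        return "".join(self.confirmed) + self.partial
```

In use, each message received from the websocket would be passed to `feed`, and the returned string shown to the user, so real-time text appears immediately and is later replaced by the corrected segment.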

how to convert LM's binary model to arpa file

KenLM doesn't seem to have a bin2arpa tool

Offline loading of pipeline ('NoneType' object has no attribute 'eval')

> 1. Edit `your/path/to/pyannote/speaker-diarization/config.yaml` > > ```yaml > pipeline: > name: pyannote.audio.pipelines.SpeakerDiarization > params: > clustering: AgglomerativeClustering > embedding: your/path/to/speechbrain/spkrec-ecapa-voxceleb # Folder, must contains `speechbrain` keyword. > embedding_batch_size: 32 >...

Why does each file have 4 elements, and why do 8 files have the same elements?

It feels like there is something wrong with this library..