FunASR
Why does the GPU version no longer return streaming responses? Recognition is very slow
Version: funasr:funasr-runtime-sdk-gpu-0.2.0  Mode: 2pass  Launch command:
nohup bash run_server.sh \
--download-model-dir /workspace/models \
--vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
--model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch \
--punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
--lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
--itn-dir thuduj12/fst_itn_zh \
--certfile 0 \
--hotword /workspace/models/hotwords.txt > log.txt 2>&1 &
Python client code:
import time
# Funasr_websocket_recognizer comes from FunASR's Python websocket client API
# (the import line was omitted in the original post)

# create a recognizer
rcg = Funasr_websocket_recognizer(host="127.0.0.1", port="10098", is_ssl=False,
                                  mode="2pass", chunk_size="5, 10, 5", chunk_interval=2)
# loop to send chunks
for i in range(chunk_num):
    beg = i * stride
    data = audio_bytes[beg:beg + stride]
    text = rcg.feed_chunk(data, wait_time=0.02)
    if len(text) > 0:
        print("text", text)
    time.sleep(0.05)
# get the last message
text = rcg.close(timeout=3)
print("text", text)
Here the whole utterance comes back as a single recognition result at the end; there is no streaming output anymore:
connect to url ws://127.0.0.1:10098
send json {"mode": "2pass", "chunk_size": [10, 20, 10], "encoder_chunk_look_back": 4, "decoder_chunk_look_back": 1, "chunk_interval": 2, "wav_name": "default", "is_speaking": true}
text {'is_final': False, 'mode': 'offline', 'stamp_sents': [{'end': 6865, 'punc': ',', 'start': 390, 'text_seg': '请 吃 这 家 伙 就 是 那 种 一 边 玩 蚊 子 游 戏', 'ts_list': [[390, 550], [550, 995], [1710, 2050], [2050, 2430], [2430, 3030], [3690, 3970], [3970, 4230], [4230, 4450], [4450, 4730], [4730, 4910], [4910, 5210], [5210, 5590], [5590, 5850], [5850, 6170], [6170, 6450], [6450, 6865]]}, {'end': 10695, 'punc': '。', 'start': 7600, 'text_seg': '一 边 等 着 你 上 钩 的 感 觉', 'ts_list': [[7600, 7759], [7759, 8139], [8139, 8400], [8400, 8639], [8639, 8900], [8900, 9219], [9219, 9519], [9519, 9719], [9719, 10040], [10040, 10695]]}], 'text': '请吃这家伙就是那种一边玩蚊子游戏,一边等着你上钩的感觉。', 'timestamp': '[[390,550],[550,995],[1710,2050],[2050,2430],[2430,3030],[3690,3970],[3970,4230],[4230,4450],[4450,4730],[4730,4910],[4910,5210],[5210,5590],[5590,5850],[5850,6170],[6170,6450],[6450,6865],[7600,7759],[7759,8139],[8139,8400],[8400,8639],[8639,8900],[8900,9219],[9219,9519],[9519,9719],[9719,10040],[10040,10695]]', 'wav_name': 'default'}
client closed
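For what it's worth, in 2pass mode the client should normally see both streaming partials and corrected final sentences, distinguished by the 'mode' field of each message; here only an 'offline' message arrives at close time. Below is a minimal sketch for separating the two on the client side, assuming the JSON layout shown in the log above (the exact mode strings such as '2pass-online'/'2pass-offline' are an assumption, not something confirmed in this thread):

import json

def handle_message(raw):
    # accept either a JSON string or an already-parsed dict, as printed above
    msg = json.loads(raw) if isinstance(raw, str) else raw
    mode = msg.get("mode", "")
    if "online" in mode:
        # streaming partial hypothesis, expected while audio is still being fed
        print("partial:", msg.get("text", ""))
    elif "offline" in mode:
        # corrected final result for the finished sentence(s)
        print("final:", msg.get("text", ""))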
Same question here.
I'm using the same image, with the service started via run_server_2pass.sh. Launch command:
/workspace/FunASR/runtime/websocket/build/bin/funasr-wss-server-2pass \
--download-model-dir /workspace/models \
--model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx \
--online-model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online-onnx \
--vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
--punc-dir damo/punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727-onnx \
--itn-dir thuduj12/fst_itn_zh \
--lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
--decoder-thread-num 48 \
--model-thread-num 1 \
--io-thread-num 3 \
--port 10097 \
--certfile /workspace/FunASR/runtime/ssl_key/server.crt \
--keyfile /workspace/FunASR/runtime/ssl_key/server.key \
--hotword /workspace/FunASR/runtime/websocket/hotwords.txt
I tested with a Python script, and GPU utilization never changes at all. Is streaming recognition simply not supported for GPU deployment?
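A quick way to double-check the "GPU utilization never changes" observation while the client is streaming is to poll nvidia-smi from Python in a second terminal. This is just a monitoring sketch, not part of FunASR:

import subprocess
import time

def poll_gpu_util(seconds=30, interval=1.0):
    # print GPU utilization once per interval while a recognition test runs
    for _ in range(int(seconds / interval)):
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=utilization.gpu", "--format=csv,noheader"],
            capture_output=True, text=True,
        )
        print("gpu util:", out.stdout.strip())
        time.sleep(interval)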
I ran into this problem too. Launch command:
bash run_server.sh \
--download-model-dir /workspace/models \
--vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
--model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch \
--punc-dir damo/punc_ct-transformer_cn-en-common-vocab471067-large-onnx \
--lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
--itn-dir thuduj12/fst_itn_zh \
--hotword /workspace/models/hotwords.txt \
--certfile 0
The server only starts recognizing each time the connection is closed.