
In 2pass mode, the English recognition result from a deployed FunASR server has no space between the last two words

Open chenpaopao opened this issue 1 year ago • 4 comments

Notice: To help resolve issues more efficiently, please follow the template when raising an issue and fill in the details.

🐛 Bug

Bug1

Deploy FunASR:

    cd FunASR/runtime
    nohup bash run_server_2pass.sh \
      --download-model-dir /workspace/models \
      --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
      --model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx \
      --online-model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online-onnx \
      --punc-dir damo/punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727-onnx \
      --lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
      --itn-dir thuduj12/fst_itn_zh \
      --hotword /workspace/models/hotwords.txt > log.txt 2>&1 &

In the English recognition result, there is no space between the last two words. For example: i can you hearme。 i just want to turn to a few ofthem。

I suspect the following code in runtime/onnxruntime/src/ct-transformer-online.cpp is at fault:

    vector<string> WordWithPunc;
    for (int i = 0; i < sentence_words_list.size(); i++) // for i in range(0, len(sentence_words_list)):
    {
        if (!(sentence_words_list[i][0] & 0x80) && (i + 1) < sentence_words_list.size() && !(sentence_words_list[i + 1][0] & 0x80))
        {
            sentence_words_list[i] = " " + sentence_words_list[i];
        }
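The condition in the excerpt above prepends a space to word `i` only when word `i + 1` exists and both start with an ASCII byte, so the final word of the list can never receive a leading space. The standalone sketch below reproduces that behaviour outside FunASR. The function names `join_as_quoted` and `join_with_prev_check` are hypothetical, and the second function is only one possible way to space ASCII words, assuming the word list is joined by plain concatenation; it is not the project's actual fix and it ignores any state carried across streaming chunks.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// True if the token's first byte has the high bit clear, i.e. it starts
// with an ASCII character rather than a multi-byte UTF-8 sequence.
static bool starts_ascii(const std::string& w) {
    return !w.empty() && !(static_cast<unsigned char>(w[0]) & 0x80);
}

// Mirrors the quoted condition: word i gets a leading space only when
// word i + 1 exists and both words start with ASCII.
std::string join_as_quoted(std::vector<std::string> words) {
    std::string out;
    for (std::size_t i = 0; i < words.size(); i++) {
        if (starts_ascii(words[i]) && (i + 1) < words.size() &&
            starts_ascii(words[i + 1])) {
            words[i] = " " + words[i];
        }
        out += words[i];
    }
    return out;
}

// Hypothetical alternative (an assumption, not the upstream fix):
// look at the previous word instead, so the last word is also separated.
std::string join_with_prev_check(const std::vector<std::string>& words) {
    std::string out;
    for (std::size_t i = 0; i < words.size(); i++) {
        if (i > 0 && starts_ascii(words[i]) && starts_ascii(words[i - 1])) {
            out += " ";
        }
        out += words[i];
    }
    return out;
}
```

For the input `{"can", "you", "hear", "me"}`, `join_as_quoted` returns `" can you hearme"`, matching the reported symptom, while `join_with_prev_check` returns `"can you hear me"`.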

Bug2

Can the punc_ct-transformer_cn-en-common-vocab471067-large-onnx model be used in run_server_2pass.sh? After I swapped it in, the service failed to start with the error: websocket-server-2pass.cpp:586 index out of range

chenpaopao • Sep 09 '24 12:09

Has this been resolved? I am running into the same problem.

80boys • Sep 10 '24 06:09

It is a bug.

LauraGPT • Sep 14 '24 07:09

I came across the same bug. Is there a solution yet?

AlvinAi96 • Sep 29 '24 09:09

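This issue seriously affects practical use. Is there any progress so far? By connecting to wss://www.funasr.com:10095/ I found that the problem has already been fixed in the official demo, but it still exists in the latest version, 0.1.11.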

lin-xiaosheng • Oct 15 '24 09:10