47 comments by Slyne Deng

ONNX supports negative indices; TorchScript (JIT) may not. You could use a non-negative value instead of -1 in this case.
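One way around this is to convert the negative index to its non-negative equivalent before scripting/export. A minimal sketch (the function name and the `size` parameter are illustrative, not from the original code):

```python
def normalize_index(idx: int, size: int) -> int:
    """Map a possibly negative index to its non-negative equivalent.

    For example, with size=5, index -1 refers to position 4.
    """
    if not -size <= idx < size:
        raise IndexError(f"index {idx} out of range for size {size}")
    # Python's modulo maps negative indices into [0, size)
    return idx % size

# -1 on a length-5 axis addresses the same element as index 4
assert normalize_index(-1, 5) == 4
assert normalize_index(2, 5) == 2
```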

Do you want to do offline or online (streaming) inference? You can take this as a reference: https://github.com/wenet-e2e/wenet/tree/main/runtime/server/x86_gpu

> I think I don't get your point.
>
> Tritonserver supports Auto-Generated Model Configuration, so I just removed the output in the template.
>
> Do you mean that...

> Why not use the HTTP API?
>
> ```shell
> curl localhost:8000/v2/models//config
> ```
>
> Then we can completely avoid processing the model details.

This suggestion is great! We...
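The idea above can be sketched as follows: fetch the model config JSON from Triton's `/v2/models/<name>/config` endpoint and read the output tensor names from it, instead of hard-coding them in a template. The config body below is a hand-written stand-in, not real server output:

```python
import json

# Hand-written stand-in for the JSON body returned by
# GET /v2/models/<model_name>/config on a Triton server.
sample_config = json.loads("""
{
  "name": "encoder",
  "platform": "onnxruntime_onnx",
  "input":  [{"name": "speech", "data_type": "TYPE_FP32", "dims": [-1, 80]}],
  "output": [{"name": "encoder_out", "data_type": "TYPE_FP32", "dims": [-1, 256]}]
}
""")

# Pull the output tensor names from the config rather than hard-coding them.
output_names = [t["name"] for t in sample_config["output"]]
print(output_names)
```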

> To add: after interpolation it reported this error: `line 7: warning: non-zero probability for <unk> in closed-vocabulary LM BOW numerator for context "" is -0.4 < 0`

Please try to use KenLM for the interpolation.

> I ran into the same problem. My arpa was built with SRILM. Chinese uses characters as the modeling units, but English uses BPE (with ▁ for spaces) rather than individual letters. After adding the language model, performance degrades badly, for both English and Chinese.
>
> I saw issue #9 earlier, but it didn't give me a clear solution. What I'm thinking now is to fix this when building the arpa: use KenLM and split the English text into space-separated letters as well. But that raises a question: what happens to the original spaces in English? One workaround I can think of is to first replace the spaces in the corpus with some other character such as "|", build the arpa, also replace spaces in the recognition results with "|", rescore with the arpa, and finally convert "|" back to spaces — which is rather cumbersome. @Slyne do you have a better approach?

@duj12 Thanks. Indeed I never added BPE support, and only KenLM is supported. As a workaround, you can first try CUDA WFST decoding: https://github.com/wenet-e2e/wenet/tree/main/runtime/gpu/cuda_decoders It can also be installed directly via pip: https://github.com/nvidia-riva/riva-asrlib-decoder Could you share whether your application is offline ASR or streaming? On my side I can:

1. First add BPE decoding, without a language model
2. Then add the language model
3. Port the wenet WFST decoder over and wrap it up
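The space-to-"|" workaround described above can be sketched as plain string processing. This is a minimal illustration of the round trip only, with the actual arpa rescoring step left as a comment; `SPACE_TOKEN` and both helper names are hypothetical:

```python
SPACE_TOKEN = "|"  # stand-in character for real spaces in the English text

def encode_spaces(text: str) -> str:
    """Replace real spaces with a placeholder, then separate every
    character with spaces so the LM sees character-level units."""
    return " ".join(text.replace(" ", SPACE_TOKEN))

def decode_spaces(text: str) -> str:
    """Invert encode_spaces after rescoring: drop the separator spaces
    and turn the placeholder back into real spaces."""
    return text.replace(" ", "").replace(SPACE_TOKEN, " ")

hyp = "HELLO WORLD"
encoded = encode_spaces(hyp)          # "H E L L O | W O R L D"
# ... rescore `encoded` with the character-level arpa here ...
assert decode_spaces(encoded) == hyp  # round-trips back to the original
```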

> From https://pytorch.org/docs/stable/generated/torch.inference_mode.html
>
> > Code run under this mode gets better performance by disabling view tracking and version counter bumps

Looks like we need alternatives to...

@ethanhe42 It was too messy to fix the signature issue in the previous PR (https://github.com/NVIDIA/NeMo/pull/8988), so I restarted with a new one. LOL..

@ethanhe42 Done.

@PannuMuthu There's one commit that isn't signed off. Please help check.

@PannuMuthu @ethanhe42 Please review. Thanks!