47 comments by Slyne Deng

ONNX supports negative indices; TorchScript (JIT) may not. You could use a non-negative value instead of -1 in this case.
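One way around this is to convert the negative index to its non-negative equivalent before scripting/export. A minimal sketch (the function name and the `size` parameter are illustrative, not from the original code):

```python
def normalize_index(idx: int, size: int) -> int:
    """Map a possibly negative index to its non-negative equivalent.

    For example, with size=5, index -1 refers to position 4.
    """
    if not -size <= idx < size:
        raise IndexError(f"index {idx} out of range for size {size}")
    # Python's modulo maps negative indices into [0, size)
    return idx % size

# -1 on a length-5 axis addresses the same element as index 4
assert normalize_index(-1, 5) == 4
assert normalize_index(2, 5) == 2
```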

Do you want to do offline or online (streaming) inference? You can take this as a reference: https://github.com/wenet-e2e/wenet/tree/main/runtime/server/x86_gpu

> I think I don't get your point.
>
> Tritonserver supports Auto-Generated Model Configuration, so I just removed the output in the template.
>
> Do you mean that...

> Why not use the HTTP API?
>
> ```shell
> curl localhost:8000/v2/models//config
> ```
>
> Then we can completely avoid processing the model details.

This suggestion is great! We...
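The idea above can be sketched as follows: fetch the model config JSON from Triton's `/v2/models/<name>/config` endpoint and read the output tensor names from it, instead of hard-coding them in a template. The config body below is a hand-written stand-in, not real server output:

```python
import json

# Hand-written stand-in for the JSON body returned by
# GET /v2/models/<model_name>/config on a Triton server.
sample_config = json.loads("""
{
  "name": "encoder",
  "platform": "onnxruntime_onnx",
  "input":  [{"name": "speech", "data_type": "TYPE_FP32", "dims": [-1, 80]}],
  "output": [{"name": "encoder_out", "data_type": "TYPE_FP32", "dims": [-1, 256]}]
}
""")

# Pull the output tensor names from the config rather than hard-coding them.
output_names = [t["name"] for t in sample_config["output"]]
print(output_names)
```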

> To add: after interpolation it reported this error: `line 7: warning: non-zero probability for <unk> in closed-vocabulary LM BOW numerator for context "" is -0.4 < 0`

Please try to use KenLM for the interpolation.

> I ran into the same problem. My arpa was built with SRILM. Chinese uses characters as the modeling units, but English uses BPE (with ▁ for spaces) rather than individual letters. After adding the language model, performance degrades badly, for both English and Chinese.
>
> I saw issue #9 earlier, but it didn't give me a clear solution. What I'm thinking now is to fix this when building the arpa: use KenLM and split the English text into space-separated letters as well. But that raises a question: what happens to the original spaces in English? One workaround I can think of is to first replace the spaces in the corpus with some other character such as "|", build the arpa, also replace spaces in the recognition results with "|", rescore with the arpa, and finally convert "|" back to spaces — which is rather cumbersome. @Slyne do you have a better approach?

@duj12 Thanks. Indeed I never added BPE support, and only KenLM is supported. As a workaround, you can first try CUDA WFST decoding: https://github.com/wenet-e2e/wenet/tree/main/runtime/gpu/cuda_decoders It can also be installed directly via pip: https://github.com/nvidia-riva/riva-asrlib-decoder Could you share whether your application is offline ASR or streaming? On my side I can:

1. First add BPE decoding, without a language model
2. Then add the language model
3. Port the wenet WFST decoder over and wrap it up
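The space-to-"|" workaround described above can be sketched as plain string processing. This is a minimal illustration of the round trip only, with the actual arpa rescoring step left as a comment; `SPACE_TOKEN` and both helper names are hypothetical:

```python
SPACE_TOKEN = "|"  # stand-in character for real spaces in the English text

def encode_spaces(text: str) -> str:
    """Replace real spaces with a placeholder, then separate every
    character with spaces so the LM sees character-level units."""
    return " ".join(text.replace(" ", SPACE_TOKEN))

def decode_spaces(text: str) -> str:
    """Invert encode_spaces after rescoring: drop the separator spaces
    and turn the placeholder back into real spaces."""
    return text.replace(" ", "").replace(SPACE_TOKEN, " ")

hyp = "HELLO WORLD"
encoded = encode_spaces(hyp)          # "H E L L O | W O R L D"
# ... rescore `encoded` with the character-level arpa here ...
assert decode_spaces(encoded) == hyp  # round-trips back to the original
```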

> From https://pytorch.org/docs/stable/generated/torch.inference_mode.html
>
> > Code run under this mode gets better performance by disabling view tracking and version counter bumps

Looks like we need alternatives to...

@ethanhe42 It was too messy to fix the signature issue in the previous PR (https://github.com/NVIDIA/NeMo/pull/8988), so I restarted with a new one. LOL..

@ethanhe42 Done.

@PannuMuthu There's one commit that isn't signed off. Please help check.

@PannuMuthu @ethanhe42 Please review. Thanks!