Results 82 comments of Yuekai Zhang

> riton service and send requests to count the time spent on the encoder, decoder, and joiner modules. I found

How do you count the time for triton modules? The...

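> Hello, would hand-written inference like this be faster than onnx + tensorrt inference? Using onnx with different Providers seems more reasonable.

On CPU, I'm not sure whether this hand-written inference is as fast as onnx. On GPU, hand-written inference is the fastest, which is the FasterTransformer approach; using onnx for GPU inference is far slower than a hand-written implementation. As for onnx + tensorrt, as long as the hand-written implementation has no major problems, it is generally the faster one as well. This is why projects like FasterTransformer exist.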
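A comparison like this is easiest to settle by measuring both paths the same way. The sketch below is a generic timing harness, not code from any of the projects mentioned; `mean_latency_ms` and its parameters are hypothetical names. It assumes each backend is wrapped in a zero-argument callable, and that for GPU backends the callable itself waits for the device to finish, since CUDA kernels are launched asynchronously.

```python
import time
from typing import Callable


def mean_latency_ms(fn: Callable[[], object], warmup: int = 3, iters: int = 20) -> float:
    """Return the mean wall-clock latency of fn in milliseconds.

    The warm-up calls are excluded from the measurement, because the first
    runs often include one-time costs (allocation, kernel selection, caching).
    """
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / iters
```

Calling it once per backend with the same input, for example `mean_latency_ms(run_handwritten)` and `mean_latency_ms(run_onnx)` where both callables are placeholders you supply, gives numbers that can be compared directly.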

> Current perf Cuda vs Trt
>
> csrc/online-zipformer2-transducer-model.cc:RunEncoder:445 Encoder Duration : **1.930044** ms
> csrc/online-zipformer2-transducer-model.cc:RunEncoder:445 Encoder Duration : **0.034984** ms
> csrc/online-zipformer2-transducer-model.cc:RunEncoder:445 Encoder Duration : **0.034912** ms
> csrc/online-websocket-server-impl.cc:Run:256 Warm up completed...

@76176235 flake8 reports:

./runtime/gpu/model_repo/feature_extractor/1/model.py:192:25: W292 no newline at end of file

Could you fix the above CI/CD flake8 issue? Also, would you mind updating the knf pip install here: https://github.com/wenet-e2e/wenet/blob/main/runtime/gpu/Dockerfile/Dockerfile.server#L6?
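W292 means the last line of the file is not terminated by a newline. Most editors can fix this on save; as a minimal sketch, the same fix can be scripted. `ensure_trailing_newline` is a hypothetical helper written for illustration, not part of flake8 or wenet.

```python
from pathlib import Path


def ensure_trailing_newline(path: str) -> bool:
    """Append a newline if the file is non-empty and does not end with one.

    Returns True if the file was modified, False otherwise.
    """
    file = Path(path)
    data = file.read_bytes()
    if not data or data.endswith(b"\n"):
        return False
    file.write_bytes(data + b"\n")
    return True
```

Running `ensure_trailing_newline("runtime/gpu/model_repo/feature_extractor/1/model.py")` from the repository root should clear the warning for that file.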

> Which models are you using? Unfortunately, this porting requires manual implementation, and degradations would happen due to this.
>
> In addition, to debug all functions, we are also...

docker run --gpus all -v : --name tts_server --net host -it tts_server:latest

For example, if you want to map the host path /home/wjmessi1 to /workspace/mytest inside the docker container, you can do that with the -v option, i.e. -v /home/wjmessi1:/workspace/mytest.
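The -v option takes the host path and the container path joined by a colon. The sketch below only assembles the command line for the example paths above; `docker_run_command` is a hypothetical helper, and the image and container names are taken from the command in this comment.

```python
import shlex


def docker_run_command(host_path: str, container_path: str,
                       image: str = "tts_server:latest",
                       name: str = "tts_server") -> list[str]:
    """Build the `docker run` argument list with one bind mount (-v host:container)."""
    return [
        "docker", "run", "--gpus", "all",
        "-v", f"{host_path}:{container_path}",
        "--name", name, "--net", "host", "-it", image,
    ]


cmd = docker_run_command("/home/wjmessi1", "/workspace/mytest")
print(shlex.join(cmd))
# docker run --gpus all -v /home/wjmessi1:/workspace/mytest --name tts_server --net host -it tts_server:latest
```

Files written under /workspace/mytest inside the container then appear under /home/wjmessi1 on the host, and vice versa.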

Hello, this has not been tested. If you are interested, I can help you get it running.

@aleksandr-smechov https://github.com/k2-fsa/sherpa/tree/master/triton/whisper Have you tried this Triton python_backend + Whisper TRT-LLM setup? If @piotrm-nvidia is willing to accept a PR, I'd love to prepare a pytriton Whisper TensorRT-LLM recipe under pytriton/example.

> Thanks for your really impressive work.
>
> I was wondering how to extract the token probability with TensorRT (a little bit like what you did in this example...

@shashikg It's a very nice and exciting project. If you're not in a hurry, I will be available to help you in 20 days. One concern is that, with the...