Kaiyu Xie
@Pzzzzz5142 Thanks for your contribution, we've integrated your fix into the internal codebase; it will be included in the next push to the GitHub main branch. We'll credit you as...
Hi @KuntaiDu , the official location of the `config.pbtxt` files for v0.11 is here: https://github.com/triton-inference-server/tensorrtllm_backend/tree/v0.11.0/all_models/inflight_batcher_llm. Before you launch tritonserver, you'll need to set several parameters; please follow...
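For context, the files under `all_models/inflight_batcher_llm` are templates whose `${...}` placeholders need to be filled in before tritonserver can load the model repository. Below is a minimal, assumption-laden sketch of that substitution step; the parameter names (`triton_max_batch_size`, `engine_dir`, etc.) are illustrative examples, and the repo also ships its own helper script for this, so treat the snippet as an outline rather than the official procedure:

```python
# Hypothetical sketch: fill the ${...} placeholders in copied config.pbtxt
# templates before starting tritonserver. Parameter names below are
# examples only; consult the backend docs for the authoritative list.
from pathlib import Path
from string import Template

# Your local copy of all_models/inflight_batcher_llm.
model_repo = Path("triton_model_repo")

# Example parameter values -- adjust to your engine build and deployment.
params = {
    "triton_max_batch_size": "64",
    "engine_dir": "/models/llama/trt_engines/fp16/1-gpu",
    "decoupled_mode": "True",
    "batching_strategy": "inflight_fused_batching",
}

for pbtxt in model_repo.rglob("config.pbtxt"):
    text = pbtxt.read_text()
    # safe_substitute leaves any placeholder we did not list untouched,
    # so unfilled values are easy to spot afterwards.
    pbtxt.write_text(Template(text).safe_substitute(params))
```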
@A-transformer Can you please also help us understand a bit more about why this change is necessary? Thanks!
/bot run --add-multi-gpu-test
Hi @tloen , the issue should be addressed by [this PR](https://github.com/NVIDIA/TensorRT-LLM/pull/2333). Can you please try it and see if that solves the problem? Feel free to let us know if there...
Hi @RobinJYM , `generation_time` here means the latency of the generation stage. So if I understand the question correctly, and you want the latency of the "rest tokens apart from the first token",...
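To make the distinction concrete, here is a small sketch of how the generation-stage latency relates to end-to-end latency and time-to-first-token; the variable names and numbers are illustrative, not actual benchmark output fields:

```python
# Illustrative arithmetic: splitting end-to-end latency into the
# first-token (context/prefill) part and the generation part.
e2e_latency_ms = 1250.0          # total request latency
time_to_first_token_ms = 180.0   # up to and including the first token
num_output_tokens = 128

# Latency attributed to all tokens after the first one.
generation_time_ms = e2e_latency_ms - time_to_first_token_ms

# Average per-token latency during the generation stage.
per_token_ms = generation_time_ms / (num_output_tokens - 1)

print(f"generation_time = {generation_time_ms:.1f} ms, "
      f"~{per_token_ms:.2f} ms/token")
```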
> @kaiyux Could you advise what would be the approach for external contribution here? Since we have not switched to GitHub-based development for this repo yet, we'll need someone to...
> Thanks @kaiyux! I can help with integrating this into the internal repo once the changes are finalized. What steps need to be taken to properly credit the contributor? We do...
Hi @xwuShirley, thanks for your attention. There are some changes we haven't pushed to the main branch yet; we will keep you posted.
/bot run --skip-test