Kaiyu Xie comments



Executor API: How to get throughput

> Thanks [@juney-nvidia](https://github.com/juney-nvidia), how do I use `trtllm.KvCacheConfig` with `trtllm-bench`?

@khayamgondal There is a [`--extra_llm_api_options`](https://github.com/NVIDIA/TensorRT-LLM/blob/8bb3eea285db15c3b54c66230eb2701505fc863f/tensorrt_llm/bench/benchmark/throughput.py#L49) argument provided by `trtllm-bench` that allows you to specify any custom configurations following the [`LlmArgs`](https://github.com/NVIDIA/TensorRT-LLM/blob/8bb3eea285db15c3b54c66230eb2701505fc863f/tensorrt_llm/llmapi/llm_args.py#L760) data structure, and...
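
For illustration, here is a rough sketch of how the `--extra_llm_api_options` argument might be used to pass KV cache settings. It only writes an options file and assembles the command line; it does not run the benchmark. Everything beyond `--extra_llm_api_options` itself is an assumption on my part: the `kv_cache_config` key and its `free_gpu_memory_fraction` / `enable_block_reuse` fields, the `--model` and `--dataset` flags, the `throughput` subcommand, and the placeholder model and dataset names. Please check them against `llm_args.py` and `throughput.py` at the commit you are using.

```python
import shlex
import tempfile
from pathlib import Path

# Assumed LlmArgs-style options. The key and field names below are
# assumptions and should be verified against llm_args.py for your version.
EXTRA_OPTIONS_YAML = """\
kv_cache_config:
  free_gpu_memory_fraction: 0.9
  enable_block_reuse: false
"""


def write_extra_options(directory: Path) -> Path:
    """Write the options file and return its path."""
    path = directory / "extra_llm_api_options.yaml"
    path.write_text(EXTRA_OPTIONS_YAML)
    return path


def build_bench_command(model: str, dataset: Path, options: Path) -> list[str]:
    """Assemble (but do not run) a trtllm-bench throughput invocation.

    The --model/--dataset flags and the subcommand layout are assumptions.
    """
    return [
        "trtllm-bench",
        "--model", model,
        "throughput",
        "--dataset", str(dataset),
        "--extra_llm_api_options", str(options),
    ]


if __name__ == "__main__":
    workdir = Path(tempfile.mkdtemp())
    options_path = write_extra_options(workdir)
    # "my-model" and "dataset.jsonl" are hypothetical placeholders.
    command = build_bench_command("my-model", Path("dataset.jsonl"), options_path)
    print(shlex.join(command))
```

Keeping the settings in a file means the same KV cache configuration can be reused across benchmark runs without changing the command itself.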