TensorRT-LLM
How to set `top_p` and `top_k` arguments in `gptManagerBenchmark`?
What is the sampling strategy in gptManagerBenchmark?
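gptManagerBenchmark does not support specifying a sampling strategy yet. It uses the default values, top_p=0.0 and top_k=1. With top_k=1 only the single most likely token is a candidate at each step, so the benchmark effectively runs greedy decoding.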
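To illustrate why top_k=1 makes the output deterministic, here is a minimal, self-contained sketch of top-k / top-p sampling in plain Python. It is not TensorRT-LLM code: the function `sample_token` and its handling of out-of-range values (treating `top_k <= 0` and `top_p` outside (0, 1) as "filter disabled") are assumptions made for this example only.

```python
import math
import random


def sample_token(logits, top_k=1, top_p=0.0, rng=None):
    """Illustrative top-k / top-p sampling over a list of logits.

    Hypothetical helper for explanation only; not the TensorRT-LLM implementation.
    """
    rng = rng or random.Random()

    # Rank token ids by logit, highest first.
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)

    # Assumption: top_k <= 0 means "no top-k limit".
    if top_k > 0:
        ranked = ranked[:top_k]

    # Softmax over the surviving candidates (shifted by the max for stability).
    peak = logits[ranked[0]]
    weights = [math.exp(logits[i] - peak) for i in ranked]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Assumption: top_p outside (0, 1) means "no nucleus filtering".
    if 0.0 < top_p < 1.0:
        kept, cumulative = 0, 0.0
        for p in probs:
            kept += 1
            cumulative += p
            if cumulative >= top_p:
                break
        ranked, probs = ranked[:kept], probs[:kept]

    # Draw one token id from the remaining candidates.
    return rng.choices(ranked, weights=probs, k=1)[0]


logits = [0.1, 2.5, -1.0, 2.4]

# With the benchmark's defaults, every draw returns the argmax (index 1).
greedy_draws = {sample_token(logits, rng=random.Random(seed)) for seed in range(20)}

# With top_k=2, either of the two highest-scoring tokens (1 or 3) can be drawn.
top2_draws = {sample_token(logits, top_k=2, rng=random.Random(seed)) for seed in range(20)}
```

With top_k=1 the candidate list has exactly one entry before top_p is even considered, so the value of top_p has no effect on the result in this sketch.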