TensorRT-LLM
What's the throughput of R1 671B using bs=1 without quant?
For H200, what's the throughput of R1 671B using bs=1 without quantization?