[Usage]: How to change the batch size when testing vLLM throughput with benchmark_throughput
Your current environment
The output of `python collect_env.py`
How would you like to use vllm
I want to measure throughput with `benchmarks/benchmark_throughput.py`, but I don't see how to control the batch size that vLLM uses during the run.
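
For context, my understanding (which may be wrong) is that the batch size is governed by the `max_num_seqs` engine argument rather than a dedicated benchmark flag. Below is a minimal sketch of what I mean, measuring offline throughput directly with the `LLM` API and capping `max_num_seqs`; the model name, prompt count, and sampling settings are just placeholders for illustration.

```python
# Minimal sketch (assuming max_num_seqs is the knob that caps the batch size):
# measure offline generation throughput directly with the LLM API.
import time

from vllm import LLM, SamplingParams

# Limit how many sequences the scheduler batches per iteration.
llm = LLM(model="facebook/opt-125m", max_num_seqs=32)

prompts = ["Hello, my name is"] * 256
sampling_params = SamplingParams(temperature=0.8, max_tokens=128, ignore_eos=True)

start = time.perf_counter()
outputs = llm.generate(prompts, sampling_params)
elapsed = time.perf_counter() - start

total_tokens = sum(len(o.outputs[0].token_ids) for o in outputs)
print(f"Throughput: {len(prompts) / elapsed:.2f} requests/s, "
      f"{total_tokens / elapsed:.2f} output tokens/s")
```

I have also tried passing `--max-num-seqs` on the `benchmark_throughput.py` command line, but I'm not sure whether my version of the script forwards that flag to the engine. Is this the intended way to vary the batch size for the throughput benchmark?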