[Usage]: How to change the batch size when testing vLLM throughput with benchmark_throughput
Your current environment
The output of `python collect_env.py`
How would you like to use vllm
I want to measure throughput with `benchmarks/benchmark_throughput.py`, but I don't see how to control the batch size that vLLM uses during the run.
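
For context, my understanding (which may be wrong) is that the batch size is governed by the `max_num_seqs` engine argument rather than a dedicated benchmark flag. Below is a minimal sketch of what I mean, measuring offline throughput directly with the `LLM` API and capping `max_num_seqs`; the model name, prompt count, and sampling settings are just placeholders for illustration.

```python
# Minimal sketch (assuming max_num_seqs is the knob that caps the batch size):
# measure offline generation throughput directly with the LLM API.
import time

from vllm import LLM, SamplingParams

# Limit how many sequences the scheduler batches per iteration.
llm = LLM(model="facebook/opt-125m", max_num_seqs=32)

prompts = ["Hello, my name is"] * 256
sampling_params = SamplingParams(temperature=0.8, max_tokens=128, ignore_eos=True)

start = time.perf_counter()
outputs = llm.generate(prompts, sampling_params)
elapsed = time.perf_counter() - start

total_tokens = sum(len(o.outputs[0].token_ids) for o in outputs)
print(f"Throughput: {len(prompts) / elapsed:.2f} requests/s, "
      f"{total_tokens / elapsed:.2f} output tokens/s")
```

I have also tried passing `--max-num-seqs` on the `benchmark_throughput.py` command line, but I'm not sure whether my version of the script forwards that flag to the engine. Is this the intended way to vary the batch size for the throughput benchmark?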