lkchen
Results: 12 comments of lkchen
https://github.com/vllm-project/vllm/pull/17084 removed the sampler from the model, so this PR needs a rebase. Let me see if I can help.
Hi @cjsdurj, may I ask how to reproduce the before throughput of 2 tk/s and the after throughput of 136 tk/s? I'm using https://github.com/lk-chen/vllm/pull/2 on an L40S, forcing **vLLM v0, model=Qwen/Qwen2.5-1.5B-Instruct, async...