lkchen

12 comments by lkchen

https://github.com/vllm-project/vllm/pull/17084 removed the sampler from the model, so this PR needs a rebase. Let me see if I can help.

Hi @cjsdurj, may I ask how to reproduce the before throughput of 2 tk/s and the after throughput of 136 tk/s? I'm using https://github.com/lk-chen/vllm/pull/2 on an L40S, forcing **vLLM v0, model=Qwen/Qwen2.5-1.5B-Instruct, async...