Hosang

3 comments by Hosang

Closing this PR since the changes have already been implemented and merged via https://github.com/vllm-project/vllm/pull/18226/files

Hi @tlrmchlsmth, thanks for letting me know about the new V1 kernel. As you mentioned, there seem to be significant performance improvements in V1 due to the new prefill/decode method. As...

Closing this PR in favor of the new V1 support: https://github.com/vllm-project/vllm/pull/17004