3 comments by Hosang
Closing this PR since the changes have already been implemented and merged via https://github.com/vllm-project/vllm/pull/18226/files
Hi @tlrmchlsmth, thanks for letting me know about the new V1 kernel. As you mentioned, there seem to be significant performance improvements in V1 due to the new prefill/decode method. As...
Closing this PR in favor of the new V1 support: https://github.com/vllm-project/vllm/pull/17004