[V1][WIP] 2nd try of Hybrid allocator for full attention & sliding window attention interleaved models
This PR tries another implementation approach for #12655.
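
For context on what "hybrid allocator" means here: full-attention layers need KV-cache blocks for every token of the sequence, while sliding-window layers only need blocks covering the most recent window, so the two layer types can be allocated differently. The sketch below is only an illustration of that idea under assumed names (`HybridBlockAllocator`, `LayerSpec`, `append_token` are hypothetical); it is not the code in this PR or the vLLM KV cache manager.

```python
# Illustrative sketch only: class and method names are hypothetical,
# not taken from this PR or the vLLM codebase.
from dataclasses import dataclass
from typing import Optional


@dataclass
class LayerSpec:
    """Per-layer attention type: sliding_window is None for full attention."""
    sliding_window: Optional[int] = None


class HybridBlockAllocator:
    """Allocates KV-cache blocks per layer for a single sequence.

    Full-attention layers keep blocks for every token; sliding-window layers
    only keep blocks overlapping the last `sliding_window` tokens, so older
    blocks can be released as the sequence grows.
    """

    def __init__(self, layers: list[LayerSpec], block_size: int) -> None:
        self.layers = layers
        self.block_size = block_size
        # Block table per layer: list of block ids (None = freed / not needed).
        self.block_tables: list[list[Optional[int]]] = [[] for _ in layers]
        self._next_block_id = 0

    def _new_block(self) -> int:
        block_id = self._next_block_id
        self._next_block_id += 1
        return block_id

    def append_token(self, seq_len: int) -> None:
        """Grow the block tables so they cover a sequence of `seq_len` tokens."""
        num_blocks = (seq_len + self.block_size - 1) // self.block_size
        for layer, table in zip(self.layers, self.block_tables):
            # Allocate any blocks the new token needs.
            while len(table) < num_blocks:
                table.append(self._new_block())
            if layer.sliding_window is None:
                continue  # full attention: keep all blocks
            # Sliding window: blocks entirely outside the window can be freed.
            first_needed_token = max(0, seq_len - layer.sliding_window)
            first_needed_block = first_needed_token // self.block_size
            for i in range(first_needed_block):
                table[i] = None  # freed slot; a real allocator would recycle it

    def num_live_blocks(self) -> int:
        return sum(1 for table in self.block_tables
                   for block in table if block is not None)


if __name__ == "__main__":
    # A toy model with interleaved full-attention and sliding-window layers.
    layers = [LayerSpec(None), LayerSpec(sliding_window=64),
              LayerSpec(None), LayerSpec(sliding_window=64)]
    allocator = HybridBlockAllocator(layers, block_size=16)
    for seq_len in range(1, 257):
        allocator.append_token(seq_len)
    # Full-attention layers hold 16 blocks each; sliding-window layers hold
    # only the 4 blocks overlapping the last 64 tokens.
    print(allocator.num_live_blocks())  # 2*16 + 2*4 = 40
```

The point of the sketch is just the memory-savings argument: for interleaved models, a single allocator that tracks per-layer window sizes can drop far-past blocks on sliding-window layers instead of holding the full sequence for every layer.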