Sliding window attention

Open WoosukKwon opened this issue 11 months ago • 2 comments

While I saw this item in the roadmap, I'm wondering if this feature will be supported in the near future or not.

WoosukKwon, Mar 06 '24 05:03

I skipped the item because we don't need special support for SWA if we set page_size to 1. For larger page_size values, I think SWA support is still necessary, so I've added it to the v0.0.4 release plan.
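
To illustrate the page_size point, here is a minimal sketch in plain Python. It is not FlashInfer code: the `window_pages` helper, the flat per-sequence page list, and the convention that the window covers the `window` most recent cached tokens are all assumptions made for illustration.

```python
def window_pages(page_indices, seq_len, window, page_size):
    """Select the KV-cache pages that overlap a sliding window (hypothetical helper).

    page_indices: page ids of one sequence, in token order (assumed layout).
    seq_len:      number of tokens currently stored for the sequence.
    window:       number of most recent tokens the query may attend to (assumed convention).
    Returns (kept_pages, stale), where stale counts the tokens in the first
    kept page that lie before the start of the window.
    """
    first_token = max(0, seq_len - window)            # oldest token still inside the window
    first_page = first_token // page_size             # page that holds that token
    num_pages = (seq_len + page_size - 1) // page_size
    kept_pages = page_indices[first_page:num_pages]
    stale = first_token - first_page * page_size      # out-of-window tokens in the first kept page
    return kept_pages, stale


pages = list(range(100, 110))

# page_size == 1: trimming the page list is exact, nothing is left to mask.
print(window_pages(pages, seq_len=10, window=4, page_size=1))

# page_size == 4: the first kept page still holds tokens outside the window.
print(window_pages(pages, seq_len=10, window=4, page_size=4))
```

With page_size 1 the `stale` count is always 0, so dropping page indices is enough. With a larger page size the first kept page can contain tokens outside the window, and something has to skip them.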

yzh119, Mar 06 '24 20:03

@yzh119 Oh yes, we don't need a new kernel for decode. However, if I understand correctly, we need a new kernel for prefills?
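
A small sketch of why prefill differs, assuming a prefill over `q_len` new tokens with no earlier cache and a window that counts the current token. The function name is hypothetical and this is only an illustration of the mask, not how any FlashInfer kernel is written.

```python
def sliding_window_causal_mask(q_len, window):
    """Boolean mask where mask[i][j] is True if query i may attend to key j.

    Assumes query i sits at position i, keys cover positions 0..q_len-1,
    and the window counts the current token plus the window-1 tokens before it.
    """
    return [
        [(j <= i) and (i - j < window) for j in range(q_len)]
        for i in range(q_len)
    ]


for row in sliding_window_causal_mask(q_len=4, window=2):
    print("".join("x" if allowed else "." for allowed in row))
```

Each query row starts its window at a different key position, so trimming a shared page list cannot express the pattern. The lower bound has to be applied per row.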

WoosukKwon, Mar 07 '24 10:03

Sorry for the late reply. Support for this was added in v0.1.2: https://github.com/flashinfer-ai/flashinfer/releases/tag/v0.1.2

yzh119, Aug 31 '24 23:08