
[RFC] Unifying `chunk` and `fused_chunk` mode

Open sustcsonglin opened this issue 7 months ago • 0 comments

Proposal

The `chunk` and `fused_chunk` modes have complementary strengths in different scenarios. The interface should be unified so that users do not need to know which implementation runs underneath: the API should automatically select `chunk` or `fused_chunk` mode based on the input shape.
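To make the idea concrete, here is a minimal sketch of what shape-based dispatch could look like. Everything in it is an assumption for illustration: the names `select_mode` and `unified_op`, the `(batch, seq_len, heads, head_dim)` layout, the sequence-length cutoff, and even the direction of the rule are placeholders, not the library's API or measured behavior. A real rule would have to come from benchmarking both modes.

```python
from typing import Any, Callable, Dict, Tuple

# Assumed layout: (batch, seq_len, heads, head_dim). Hypothetical, for illustration only.
Shape = Tuple[int, int, int, int]

# Hypothetical cutoff; a real value would need to come from benchmarks.
SEQ_LEN_THRESHOLD = 2048


def select_mode(shape: Shape, seq_len_threshold: int = SEQ_LEN_THRESHOLD) -> str:
    """Pick a mode name from the input shape using a placeholder rule."""
    _, seq_len, _, _ = shape
    # Placeholder heuristic: which mode wins where is NOT established here.
    return "fused_chunk" if seq_len < seq_len_threshold else "chunk"


def unified_op(
    shape: Shape,
    kernels: Dict[str, Callable[..., Any]],
    *args: Any,
    **kwargs: Any,
) -> Any:
    """Single entry point: the caller never names a mode explicitly."""
    mode = select_mode(shape)
    return kernels[mode](*args, **kwargs)


# Stub kernels standing in for the two real implementations.
stub_kernels = {
    "chunk": lambda x: ("chunk", x),
    "fused_chunk": lambda x: ("fused_chunk", x),
}
```

With this shape, callers would only ever invoke `unified_op(...)`, and the selection rule could be retuned later without touching call sites.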

Rationale

No response

sustcsonglin • Mar 14 '25 21:03