
Inquiry about Turing GPU support progress in FlashAttention-2

Open · wangsen-zy opened this issue 8 months ago

Dear FlashAttention Development Team,

First off, thank you so much for developing such an excellent library as FlashAttention!

I noticed in the project's README (https://github.com/Dao-AILab/flash-attention) that FlashAttention-2 currently supports Ampere, Ada, or Hopper GPUs. The README also mentions that "Support for Turing GPUs (T4, RTX 2080) is coming soon, please use FlashAttention 1.x for Turing GPUs for now."
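
As a quick illustration of where Turing falls short of that support matrix, here is a minimal check (assuming only PyTorch): Turing cards (T4, RTX 20-series) report compute capability 7.5, while Ampere, Ada, and Hopper correspond to 8.0 and newer.

```python
import torch

# Turing (T4, RTX 20-series) reports compute capability 7.5; the README's
# supported architectures (Ampere, Ada, Hopper) are 8.0 and newer.
major, minor = torch.cuda.get_device_capability()
if (major, minor) >= (8, 0):
    print("This GPU meets FlashAttention-2's stated requirement.")
else:
    print(f"sm_{major}{minor}: pre-Ampere, use FlashAttention 1.x for now.")
```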

I am still using a Turing-architecture GPU (specifically an RTX 2080 Ti) and am very keen to leverage the new features and performance benefits of FlashAttention-2 on this card as soon as possible. Being able to use v2 is quite important for my work.

Could you share an estimated timeline for when FlashAttention-2 might officially support Turing GPUs, or any development plan or roadmap related to this?

Any update or information on this progress would be extremely helpful to me.

Looking forward to your reply. Thank you again for your time and effort on this great project!

wangsen-zy commented on Apr 22 '25

We're not actively working on Turing, but there's a version here: https://github.com/Dao-AILab/flash-attention/issues/1533

tridao commented on Apr 22 '25
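
For anyone on a Turing card in the meantime, a minimal fallback sketch (an illustration only, not an official recipe; it assumes the flash-attn 2.x `flash_attn_func` entry point and PyTorch 2.x): use the fused kernel where the hardware supports it, and fall back to PyTorch's built-in `scaled_dot_product_attention`, which runs on any architecture.

```python
import torch
import torch.nn.functional as F

try:
    # Public entry point of the flash-attn 2.x package; expects
    # (batch, seqlen, nheads, headdim) fp16/bf16 tensors on CUDA.
    from flash_attn import flash_attn_func
    HAVE_FLASH2 = True
except ImportError:
    HAVE_FLASH2 = False

def attention(q, k, v, causal=True):
    """Fused attention where available, portable SDPA otherwise."""
    if HAVE_FLASH2 and torch.cuda.get_device_capability() >= (8, 0):
        return flash_attn_func(q, k, v, causal=causal)
    # PyTorch's built-in SDPA wants (batch, nheads, seqlen, headdim),
    # so transpose around the call; it works on pre-Ampere GPUs too.
    q, k, v = (t.transpose(1, 2) for t in (q, k, v))
    out = F.scaled_dot_product_attention(q, k, v, is_causal=causal)
    return out.transpose(1, 2)
```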

Hi team, do you have any tentative release date for this feature?

Jyothirmaikottu commented on Sep 03 '25

No, we're not working on Turing.

tridao commented on Sep 03 '25