
Flash Attention 3 fp8 support 4090?

Open huanpengchu opened this issue 1 year ago • 1 comments

My device is a 4090, which I thought used the Hopper architecture, the same as the H100. But the homepage says "Requirements: H100 / H800 GPU, CUDA >= 12.3."

I would like to know whether FlashAttention 3 supports the 4090.

huanpengchu avatar Aug 02 '24 09:08 huanpengchu

4090 is Ada, not Hopper

tridao avatar Aug 02 '24 11:08 tridao
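The architecture mismatch above comes down to CUDA compute capability: FlashAttention 3's stated requirement (H100/H800) corresponds to Hopper, sm_90. As a minimal sketch, here is a lookup from (major, minor) capability to architecture name for the GPUs discussed in this thread; the `cuda_arch` helper is hypothetical (not part of flash-attention), and the capability pairs are NVIDIA's published values. On a live system the pair could come from `torch.cuda.get_device_capability()`.

```python
# Hypothetical helper: map CUDA compute capability (major, minor) to the
# NVIDIA architecture name, for the GPUs mentioned in this thread.
def cuda_arch(major: int, minor: int) -> str:
    table = {
        (8, 9): "Ada (e.g. RTX 4090)",
        (9, 0): "Hopper (H100 / H800)",
        (12, 0): "Blackwell (e.g. RTX 5090)",
    }
    return table.get((major, minor), f"unknown (sm_{major}{minor})")

# The RTX 4090 reports capability 8.9, i.e. Ada, not Hopper (9.0),
# so it falls outside FA3's stated H100/H800 requirement.
print(cuda_arch(8, 9))  # -> "Ada (e.g. RTX 4090)"
print(cuda_arch(9, 0))  # -> "Hopper (H100 / H800)"
```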

@tridao I wonder if it's possible to support 5090?

han508 avatar Jan 22 '25 05:01 han508

@han508 The 5090 is Blackwell. I need FlashAttention 3 FP4 Blackwell support, and soon! Otherwise I'll have to write the code myself, which will be time-consuming. I'm starting the training runs in April.

hg0428 avatar Mar 11 '25 01:03 hg0428

@hg0428 can you offer insight into why you need FA FP4 support? Have you tested the accuracy?

mnicely avatar Mar 11 '25 20:03 mnicely

> @hg0428 can you offer insight into why you need FA FP4 support? Have you tested the accuracy?

Give this a good read: https://arxiv.org/pdf/2501.17116

hg0428 avatar Mar 11 '25 21:03 hg0428