
flash decoding algorithm numerical error

Open hanzz2007 opened this issue 9 months ago • 2 comments

In `combine_attn_seqk_parallel`, the global maximum score m isn't calculated and O_i isn't properly rescaled, so the result might have more numerical error than v1 and v2.
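For readers following along, the kind of rescaling being described can be sketched in NumPy roughly as below. This is only an illustration under stated assumptions, not the code in `combine_attn_seqk_parallel`: the function names (`attention_reference`, `split_partial`, `combine_with_global_max`), the single-query shapes, and the choice to keep each O_i unnormalized are all hypothetical. Each split returns its local maximum m_i, its local sum l_i, and an unnormalized O_i; the combine step takes the global maximum m and rescales every split by exp(m_i - m).

```python
import numpy as np

def attention_reference(q, k, v):
    """Single-query softmax attention over all keys at once."""
    s = k @ q                          # scores, shape (n,)
    p = np.exp(s - s.max())            # subtract the max for stability
    return (p @ v) / p.sum()

def split_partial(q, k_i, v_i):
    """Hypothetical per-split pass: local max m_i, local sum l_i, unnormalized O_i."""
    s = k_i @ q
    m_i = s.max()
    p = np.exp(s - m_i)
    return m_i, p.sum(), p @ v_i

def combine_with_global_max(partials):
    """Rescale each split by exp(m_i - m), where m is the global maximum score."""
    m = max(m_i for m_i, _, _ in partials)
    num = sum(np.exp(m_i - m) * o_i for m_i, _, o_i in partials)
    den = sum(np.exp(m_i - m) * l_i for m_i, l_i, _ in partials)
    return num / den

# Small check with deliberately large scores (random data, assumed shapes).
rng = np.random.default_rng(0)
q = rng.standard_normal(64) * 10.0
k = rng.standard_normal((1024, 64))
v = rng.standard_normal((1024, 64))

partials = [
    split_partial(q, k_i, v_i)
    for k_i, v_i in zip(np.array_split(k, 8), np.array_split(v, 8))
]
max_abs_diff = np.abs(combine_with_global_max(partials) - attention_reference(q, k, v)).max()
print(max_abs_diff)
```

The printed value is the largest absolute difference between the split-and-combine result and the single-pass reference. The sketch says nothing about how the actual kernel behaves; it only shows what "global maximum plus rescaling" means.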

hanzz2007, May 14 '24 02:05

@tridao

hanzz2007, May 14 '24 03:05

Can you give a short script showing the numerical error?

tridao, May 14 '24 04:05