ring-flash-attention

Precision issue

Open • hxdtest opened this issue 11 months ago • 1 comment

There are some arithmetic errors with the current implementation. The reason for them is probably that flash attention will return bf16 values for each block, so we cannot accumulate the values with the original fp32 ones.
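To make the quoted passage concrete, here is a minimal pure-Python sketch of the effect it describes. It is not the repository's code. It assumes per-block outputs are merged with log-sum-exp weights, uses Python floats (float64) as a stand-in for the high-precision accumulator, and emulates bf16 with a hypothetical helper `to_bf16`. The names `block_attention` and `merge` are illustrative as well.

```python
import math
import struct


def to_bf16(x: float) -> float:
    """Emulate bf16 storage: round to float32, then keep the top 16 bits
    with round-to-nearest-even. Intended for finite values only."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # round half to even on the low 16 bits
    bits &= 0xFFFF0000                   # drop the low 16 bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def block_attention(scores, values):
    """Return (out, lse) for one block: softmax(scores) . values, and the
    log-sum-exp of the scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    out = sum(e * v for e, v in zip(exps, values)) / total
    return out, m + math.log(total)


def merge(blocks, round_out=None):
    """Combine per-block (out, lse) pairs into the full attention output.
    If round_out is given, each block output is rounded before it is
    accumulated, mimicking a kernel that hands back low-precision outputs."""
    lses = [lse for _, lse in blocks]
    m = max(lses)
    lse = m + math.log(sum(math.exp(l - m) for l in lses))
    acc = 0.0
    for out, l in blocks:
        if round_out is not None:
            out = round_out(out)
        acc += math.exp(l - lse) * out
    return acc


# Deterministic toy data: one query, 16 keys, scalar values, 4 blocks of 4.
scores = [math.sin(i) for i in range(16)]
values = [math.cos(3 * i) + 0.123 for i in range(16)]

full, _ = block_attention(scores, values)  # reference: all keys at once
blocks = [block_attention(scores[i:i + 4], values[i:i + 4])
          for i in range(0, 16, 4)]

exact = merge(blocks)                      # block outputs kept at full precision
lossy = merge(blocks, round_out=to_bf16)   # block outputs rounded to bf16 first

print("error with full-precision block outputs:", abs(exact - full))
print("error with bf16 block outputs:          ", abs(lossy - full))
```

In this sketch the only difference between the two merged results is the rounding applied to each block output before accumulation, which is the step the quoted passage points to.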

If bf16 precision is used throughout instead of fp32, does the problem of "accumulate the values with the original fp32 ones" no longer apply?

hxdtest • Mar 06 '24 12:03