
Speedup should be much more than 6.3x on 8K resolution?

Open sjtuzq opened this issue 10 months ago • 1 comments

Hi, thanks for your great work! I have one question: in Table 7, at 8K resolution, the TFLOPs drop from 847.73 to 3.92, so why does the overall running time only drop from 1842.48 to 293.50? Is FlexAttention the bottleneck here?
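To make the gap concrete, here is a small sketch of the arithmetic behind the question. It uses only the four figures quoted above; the variable names are my own, and I am assuming the second pair of numbers is a running-time measurement in whatever unit Table 7 reports.

```python
# Figures quoted from Table 7 (8K resolution) in the question above.
tflops_before, tflops_after = 847.73, 3.92
# Assumed to be running-time figures; units as reported in the table.
time_before, time_after = 1842.48, 293.50

# Ratio of compute before vs. after (the theoretical reduction).
compute_reduction = tflops_before / tflops_after
# Ratio of measured running time before vs. after (the practical speedup).
measured_speedup = time_before / time_after

print(f"compute reduction: {compute_reduction:.1f}x")  # 216.3x
print(f"measured speedup:  {measured_speedup:.1f}x")   # 6.3x
```

So the compute drops by roughly 216x while the measured speedup is roughly 6.3x, which is the discrepancy the title refers to.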

sjtuzq, Jan 29 '25 05:01

Thanks for your question!

Exactly. The practical acceleration currently falls behind the theoretical result because of implementation difficulties.

Huage001, Feb 12 '25 10:02