ring-flash-attention

large memory usage

Open LzhinFdu opened this issue 11 months ago • 5 comments

[image]

Thanks for sharing this excellent implementation of ring attention. Here are my test results on 2×A100 (with NVLink). Judging from these results, the memory usage of ring attention (ring_flash_attn_qkvpacked_func) seems very large, which is not what I expected. Is there a possible problem?

LzhinFdu, Mar 19 '24 08:03