gpt-neox
block-sparse flash attention support
I saw that flash attention was recently merged.
Block-sparse flash attention, an approximate attention variant, would be cool to have as well for training with very large sequence lengths: https://github.com/HazyResearch/flash-attention/blob/main/flash_attn/flash_blocksparse_attention.py
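For anyone picking this up, here is a minimal reference sketch (not the linked fused kernel, and not the existing gpt-neox attention API) of what block-sparse attention computes: a coarse block-level layout decides which query blocks may attend to which key blocks, so a real kernel only touches the active blocks instead of the full seq_len x seq_len matrix. All names, shapes, and the layout choice below are illustrative assumptions.

```python
# Reference sketch of block-sparse attention in plain PyTorch.
# A fused kernel would skip masked blocks entirely instead of
# materialising dense scores; this is only to show the semantics.
import torch
import torch.nn.functional as F


def block_sparse_attention(q, k, v, block_layout, block_size):
    """q, k, v: (batch, heads, seq_len, head_dim)
    block_layout: (heads, seq_len // block_size, seq_len // block_size) bool,
                  True where a query block may attend to a key block."""
    b, h, s, d = q.shape
    scale = d ** -0.5
    scores = torch.einsum("bhqd,bhkd->bhqk", q, k) * scale
    # Expand the block-level layout to a token-level mask.
    mask = block_layout.repeat_interleave(block_size, dim=-2)
    mask = mask.repeat_interleave(block_size, dim=-1)  # (heads, seq, seq)
    scores = scores.masked_fill(~mask.unsqueeze(0), float("-inf"))
    return torch.einsum("bhqk,bhkd->bhqd", F.softmax(scores, dim=-1), v)


if __name__ == "__main__":
    B, H, S, D, BS = 2, 4, 256, 64, 32
    q, k, v = (torch.randn(B, H, S, D) for _ in range(3))
    n_blocks = S // BS
    # Example layout: block-diagonal (local) attention plus a global first block.
    layout = torch.zeros(H, n_blocks, n_blocks, dtype=torch.bool)
    layout |= torch.eye(n_blocks, dtype=torch.bool)
    layout[:, :, 0] = True
    out = block_sparse_attention(q, k, v, layout, BS)
    print(out.shape)  # torch.Size([2, 4, 256, 64])
```

The memory win comes entirely from the kernel never building the masked blocks; the sketch above still builds them densely, so it is only useful as a correctness reference.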
Hello, I am new to the EleutherAI team and I think this would be a good issue to try to solve. May I be assigned this task, please?
@natek-1 Welcome! Thank you for your contribution.
Hey @natek-1, do you have any updates on this? It's totally alright if you haven't gotten a chance to look at it. Would it be alright if we assigned it to someone else?