Flash-Attention-Softmax-N

CUDA and Triton implementations of Flash Attention with SoftmaxN.
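For readers unfamiliar with SoftmaxN, here is a minimal plain-Python sketch of the idea: a softmax whose denominator carries an extra constant `n`, so the weights can sum to less than one. This is not the repository's CUDA or Triton code. The function signature, the boolean `mask` convention (True keeps a position), and the stability shift are assumptions made for illustration only.

```python
import math


def softmax_n(scores, n=1.0, mask=None):
    """Sketch of softmax_n(x)_i = exp(x_i) / (n + sum_j exp(x_j)).

    `mask` is a hypothetical list of booleans: True keeps a position,
    False excludes it by treating its score as -inf.
    """
    if mask is not None:
        scores = [s if keep else -math.inf for s, keep in zip(scores, mask)]
    # Shift by max(0, max score) so no exponent is positive (avoids overflow).
    shift = max(0.0, max(scores, default=0.0))
    exps = [math.exp(s - shift) for s in scores]
    # The constant n must be scaled by the same shift as the other terms.
    denom = n * math.exp(-shift) + sum(exps)
    if denom == 0.0:
        # Only reachable when n == 0 and every position is masked.
        return [0.0] * len(scores)
    return [e / denom for e in exps]
```

With `n=0` this reduces to the ordinary softmax. With `n=1` and two equal scores of zero, each weight is 1/3 and the remaining 1/3 is assigned to no position.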

Results: 2 Flash-Attention-Softmax-N issues

First, thanks for your nice work! However, when I use `flash_attention_n`, I found a bug that occurs when `attn_mask` is changed from None to attention_mask. How can I fix it?...

I added unit tests for the case when the `attn_mask` argument of `flash_attention_n` or `slow_attention_n` is not None in an attempt to reproduce #39. My unit tests pass on my...