Flash-Attention-Softmax-N
CUDA and Triton implementations of Flash Attention with SoftmaxN.
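For context, SoftmaxN replaces the standard softmax denominator with one that includes an extra additive constant n, i.e. softmax_n(x)_i = exp(x_i) / (n + Σ_j exp(x_j)); with n = 0 this reduces to the ordinary softmax. The sketch below is a plain PyTorch reference formulation of that definition for illustration only; it is not the repo's fused CUDA/Triton kernels, and the function name and signature are assumptions, not the library's documented API.

```python
import torch

def softmax_n_reference(x: torch.Tensor, n: float = 1.0, dim: int = -1) -> torch.Tensor:
    """Reference (non-fused) SoftmaxN:
    softmax_n(x)_i = exp(x_i) / (n + sum_j exp(x_j)).
    With n = 0 this is the standard softmax.
    """
    # Subtract the row max for numerical stability; the additive `n` term
    # must be rescaled by the same factor to keep the ratio unchanged.
    x_max = x.max(dim=dim, keepdim=True).values
    exp_x = torch.exp(x - x_max)
    return exp_x / (n * torch.exp(-x_max) + exp_x.sum(dim=dim, keepdim=True))
```

Because the n term lets all attention weights shrink toward zero when no key matches strongly, rows are no longer forced to sum exactly to 1.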
Flash-Attention-Softmax-N issues (2)
Thanks for your nice work! When I use `flash_attention_n`, I hit a bug that occurs when `attn_mask` is changed from None to an attention mask. How can I fix it?...
I added unit tests for the case where the `attn_mask` argument of `flash_attention_n` or `slow_attention_n` is not None, in an attempt to reproduce #39. My unit tests pass on my...
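To make the failing configuration concrete, here is a hedged sketch of a call with `attn_mask` set rather than left as None. `flash_attention_n` and the `attn_mask` argument are named in the excerpts above, but the import path, tensor layout, and mask semantics shown here are assumptions, so treat this as an illustrative reproduction attempt rather than the library's documented usage.

```python
import torch
from flash_attention_softmax_n import flash_attention_n  # import path assumed

batch, heads, seq_len, head_dim = 2, 4, 128, 64
q = torch.randn(batch, heads, seq_len, head_dim, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Boolean padding mask: True marks positions that may be attended to.
# Shape and semantics here are an assumption, not the library's contract.
attention_mask = torch.ones(batch, 1, seq_len, seq_len, device="cuda", dtype=torch.bool)
attention_mask[..., seq_len // 2 :] = False  # treat the second half as padding

# Per the reporter, the call works with attn_mask=None; the reported bug
# appears once a real mask is passed in its place.
out = flash_attention_n(q, k, v, attn_mask=attention_mask)
```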