Add attn_bias (rel-pos) support to the FAv2 example
AFAIK the maintainers of https://github.com/Dao-AILab/flash-attention/ did not have the bandwidth to support a custom attn_bias (needed for rel-pos). I think it is supported in the Triton version there, but I have seen reports that it is not very stable or performant in the backward pass.
Would it be relatively easy to add this to the TK-based attention example?
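For context, here is a minimal PyTorch sketch of the semantics I mean (reference code, not TK code; the function name and shapes are just illustrative): the bias is added to the QK^T scores before the softmax, so a fused kernel would have to fold it in tile by tile.

```python
import torch

def attention_with_bias(q, k, v, attn_bias):
    """Reference attention with an additive pre-softmax bias.

    q, k, v: (batch, heads, seq, head_dim)
    attn_bias: broadcastable to (batch, heads, seq, seq),
               e.g. a per-head relative-position bias table.
    """
    scale = q.shape[-1] ** -0.5
    scores = torch.matmul(q, k.transpose(-2, -1)) * scale
    scores = scores + attn_bias  # bias must be applied before the softmax
    probs = torch.softmax(scores, dim=-1)
    return torch.matmul(probs, v)
```

In a FlashAttention-style tiled kernel this would presumably mean loading the matching tile of attn_bias alongside each K/V block and adding it to the score tile before the online-softmax update. The backward pass additionally needs a dBias = dScores reduction (summed over any broadcast dimensions), which is where the Triton version reportedly struggles.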