
How can FlashAttention-2 be used in a prefix decoder?

Open: lonelycrab888 opened this issue 2 years ago • 1 comment

My attention_mask is a dynamic mask matrix for the prefix decoder, similar to UniLM and GLM. How should this type of attention_mask be applied to Flash Attention?
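For readers unfamiliar with this mask type, here is a minimal sketch in plain Python of a prefix-decoder mask in the UniLM/GLM style: prefix tokens are visible to every position, and the remaining tokens are attended causally. The function name `prefix_lm_mask` and its parameters are hypothetical, chosen only for illustration; they are not part of the flash-attention API. The mask is "dynamic" in the sense that `prefix_len` can differ for every sequence in a batch.

```python
from typing import List


def prefix_lm_mask(seq_len: int, prefix_len: int) -> List[List[bool]]:
    """Build a prefix-LM mask (hypothetical helper, not a flash-attention function).

    mask[i][j] is True when query position i may attend to key position j.
    """
    if not 0 <= prefix_len <= seq_len:
        raise ValueError("prefix_len must be between 0 and seq_len")
    return [
        # Every query sees the whole prefix (j < prefix_len);
        # beyond the prefix, attention is causal (j <= i).
        [j < prefix_len or j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Example: 5 tokens, of which the first 2 form the prefix.
mask = prefix_lm_mask(seq_len=5, prefix_len=2)
for row in mask:
    print(" ".join("1" if allowed else "0" for allowed in row))
# Prints:
# 1 1 0 0 0
# 1 1 0 0 0
# 1 1 1 0 0
# 1 1 1 1 0
# 1 1 1 1 1
```

The top-left block of ones is what separates this from a plain causal mask, so a causal-only option cannot express it.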

lonelycrab888, Nov 21 '23 07:11

That kind of mask is not currently supported.

tridao, Apr 18 '24 10:04