IndicBERT
Add Flash Attention 2 Support
Feature request
Flash Attention 2 is a library that provides attention kernels for faster and more memory-efficient inference and training: https://github.com/Dao-AILab/flash-attention
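
Once support is added, the model could be loaded through the standard `attn_implementation` argument that `transformers` exposes for Flash Attention 2. The sketch below shows the intended usage pattern; the model id `ai4bharat/indic-bert` is an assumption for illustration, and this call errors until the model class gains FA2 support:

```python
# Minimal sketch of loading IndicBERT with Flash Attention 2 enabled.
# Assumptions: model id "ai4bharat/indic-bert" (illustrative), a CUDA GPU,
# and flash-attn installed. FA2 requires a half-precision dtype.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "ai4bharat/indic-bert"  # assumed model id for illustration

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(
    model_id,
    torch_dtype=torch.float16,                 # FA2 kernels need fp16 or bf16
    attn_implementation="flash_attention_2",   # the requested backend
).to("cuda")

inputs = tokenizer("Example sentence.", return_tensors="pt").to("cuda")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```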