IndicBERT
Add Flash Attention 2 Support
Feature request
Flash Attention 2 is a library that provides attention kernels for faster and more memory-efficient inference and training: https://github.com/Dao-AILab/flash-attention
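
Once support is added, the model could be loaded through the standard `attn_implementation` argument that `transformers` exposes for Flash Attention 2. The sketch below shows the intended usage pattern; the model id `ai4bharat/indic-bert` is an assumption for illustration, and this call errors until the model class gains FA2 support:

```python
# Minimal sketch of loading IndicBERT with Flash Attention 2 enabled.
# Assumptions: model id "ai4bharat/indic-bert" (illustrative), a CUDA GPU,
# and flash-attn installed. FA2 requires a half-precision dtype.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "ai4bharat/indic-bert"  # assumed model id for illustration

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModel.from_pretrained(
    model_id,
    torch_dtype=torch.float16,                 # FA2 kernels need fp16 or bf16
    attn_implementation="flash_attention_2",   # the requested backend
).to("cuda")

inputs = tokenizer("Example sentence.", return_tensors="pt").to("cuda")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```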