[Feature request] incorporation of flash-attention 2

Open yang-dongxu opened this issue 2 years ago • 3 comments

Thank you for your exceptional work! Your model outperforms others with its efficiency in terms of parameters and speed. We have noticed that flash-attention has recently released version 2, which has greatly improved computation speed. We kindly request the incorporation of this update as it is urgently needed. Once again, thank you for your hard work! :)

yang-dongxu, Aug 09 '23 11:08

I've run into the same need. I believe the recently released flash-attention version 2 would significantly improve execution time for DNABERT-2. By the way, I greatly appreciate your work; it has been very helpful to me.

zhaoweiyu-github, Aug 09 '23 12:08

Hey,

Thank you very much for your interest in our work and for this great suggestion!

However, I can't do it right away since I have been quite busy recently. You are very welcome to submit a PR if you find a good way to do it. I will work on it in a few weeks.

Zhihan1996, Aug 09 '23 22:08

I think this might be harder than one would expect, since the current Triton implementation (the only one that supports ALiBi) is well known to be broken unless you use a specific dev version of Triton. There's a lot of discussion about this on the flash-attention repo. If someone really wants to give it a go, this might be promising to use instead. But currently, the implementation is a bit more involved than just switching out a few lines in DNABERT_2.
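For context on why ALiBi is the sticking point, here is a minimal pure-Python sketch of what an attention kernel has to support: a per-head additive bias on the scores before the softmax. It assumes the symmetric, encoder-style variant (bias proportional to `-|i - j|`) and a power-of-two head count; the function names are hypothetical and this is not code from DNABERT_2 or flash-attention.

```python
import math

def alibi_slopes(num_heads):
    """Per-head slopes 2^(-8*(h+1)/num_heads); assumes num_heads is a power of two."""
    assert num_heads > 0 and num_heads & (num_heads - 1) == 0
    return [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]

def alibi_bias(num_heads, seq_len):
    """Symmetric bias tensor: bias[h][i][j] = -slope_h * |i - j| (assumed variant)."""
    return [
        [[-slope * abs(i - j) for j in range(seq_len)] for i in range(seq_len)]
        for slope in alibi_slopes(num_heads)
    ]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def biased_attention_weights(scores_row, bias_row):
    """Attention weights for one query: softmax(scores + bias).

    A fused kernel has to accept this extra additive term, which is
    why ALiBi support depends on the kernel implementation.
    """
    return softmax([s + b for s, b in zip(scores_row, bias_row)])

# Example: 2 heads, 4 tokens, all raw scores equal, query at position 0.
bias = alibi_bias(num_heads=2, seq_len=4)
weights = biased_attention_weights([0.0, 0.0, 0.0, 0.0], bias[0][0])
```

With equal raw scores, the weights decrease with distance from the query position, which is the positional signal ALiBi provides in place of position embeddings.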

immanuelazn, Dec 11 '23 06:12