
Support for Flash Attention 3 for Ampere, Ada, and Hopper in LMDeploy

Open radna0 opened this issue 9 months ago • 2 comments

Flash Attention 3 now works on these platforms. Would it be feasible for the LMDeploy team to add support for it? @lvhan028

https://github.com/Dao-AILab/flash-attention/issues/1049#issuecomment-2695283567

radna0 Mar 08 '25 16:03

@lzhangzz Hi, is there a plan to implement FA3 in the TurboMind engine? Thanks!

snippetzero Apr 24 '25 10:04

@snippetzero Likely in May.

lzhangzz Apr 24 '25 15:04