flash-attention
Apple Silicon Support
More and more models are using flash attention, which is awesome. However, it's not available for Apple Silicon. Could we have flash attention for Apple Silicon on pip? Thanks so much!
I personally have no bandwidth for that, so we'd need folks to contribute.
How / where would one start with that, @tridao?
Someone with access to Manus AI, the Pro version of ChatGPT, or Devin could try doing the port with an agent. Definitely a good test for an agent.
There is already a Metal flash attention implementation, but it's in Swift: https://github.com/philipturner/metal-flash-attention
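For anyone looking for a starting point on a port: the core of flash attention is a short tiled online-softmax recurrence, independent of any backend. Here is a minimal NumPy sketch of that recurrence (function name, `block` size, and single-head layout are illustrative choices, not taken from the CUDA or Metal implementations):

```python
import numpy as np

def flash_attention(Q, K, V, block=64):
    """Tiled online-softmax attention (the flash attention recurrence),
    shown in NumPy for clarity. A Metal/MPS port would process each
    K/V block inside a kernel instead of this Python loop."""
    n, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    O = np.zeros((n, d))             # running (unnormalized) output
    m = np.full(n, -np.inf)          # running row max of the scores
    l = np.zeros(n)                  # running softmax denominator
    for s in range(0, n, block):
        Kb, Vb = K[s:s + block], V[s:s + block]
        S = (Q @ Kb.T) * scale                 # scores for this K/V block
        m_new = np.maximum(m, S.max(axis=1))   # updated row max
        p = np.exp(S - m_new[:, None])         # block probabilities
        alpha = np.exp(m - m_new)              # rescale factor for old state
        l = alpha * l + p.sum(axis=1)
        O = alpha[:, None] * O + p @ Vb
        m = m_new
    return O / l[:, None]

# Sanity check against standard (materialized) softmax attention.
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((128, 32)) for _ in range(3))
S = Q @ K.T / np.sqrt(32)
P = np.exp(S - S.max(axis=1, keepdims=True))
ref = (P / P.sum(axis=1, keepdims=True)) @ V
assert np.allclose(flash_attention(Q, K, V), ref)
```

The point of the recurrence is that the full n×n score matrix is never materialized; only one block of scores plus the running (m, l, O) state live in fast memory at a time, which is exactly what an Apple Silicon kernel would keep in threadgroup memory.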