Apple Silicon Support

Open chigkim opened this issue 1 year ago • 4 comments

More and more models are starting to use flash attention, which is awesome. However, it's not available for Apple Silicon. Can we have flash attention for Apple Silicon on pip? Thanks so much!

chigkim, Jun 03 '24 13:06

I personally have no bandwidth for that, so we'd need folks to contribute.

tridao, Jun 03 '24 16:06

How and where would one start with that, @tridao?

Reza2kn, Dec 22 '24 01:12

Someone with access to Manus AI, the Pro version of ChatGPT, or Devin could try doing the port with an agent. It would definitely be a good test for an agent.

Jerry-Master, Mar 16 '25 16:03

There seems to be Metal flash attention support already; however, it's in Swift: https://github.com/philipturner/metal-flash-attention

harvestingmoon, Oct 04 '25 13:10