Unable to build candle with flash attention on iOS
When I try to build and run a Llama 3.2 1B model on iOS (iPhone 14) with flash attention on Metal, I get `/Users/jpchen/.cargo/git/checkouts/candle-6740f55d69a3bf41/b4ec636/candle-transformers/src/models/llama.rs:254:5: not implemented: compile with '--features flash-attn'`.
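From the message it looks like a `flash-attn` cargo feature needs to be enabled. For context, this is roughly what I'd expect to add in my Cargo.toml (only the `flash-attn` feature name comes from the error itself; the git dependency layout and the `metal` feature are my guesses at how it would be wired up):

```toml
# Cargo.toml (sketch): only the "flash-attn" feature name is taken from the
# error message; the git dependency layout and the "metal" feature are my
# assumptions about how the crates would be configured.
[dependencies]
candle-core = { git = "https://github.com/huggingface/candle", features = ["metal"] }
candle-transformers = { git = "https://github.com/huggingface/candle", features = ["flash-attn"] }
```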
I'm a little unfamiliar with Candle. I see that flash attention is supported for Metal hardware, and I was curious whether this is an iOS-specific limitation or whether there's a way I could build the project to get flash attention support. Thanks.