
Support Apple Neural Engine (ANE) Transformers

Open LeiHao0 opened this issue 1 year ago • 1 comment

I noticed Apple supports ANE Transformers.

In their own words:

On devices with an M1 or newer chip, it achieves up to 10 times faster inference and 14 times lower peak memory consumption.

Does that mean running the 30B or 65B models will be possible on small-memory MacBooks?

Here are a few links:
https://github.com/apple/ml-ane-transformers
https://machinelearning.apple.com/research/neural-engine-transformers
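
For context, Apple's ml-ane-transformers routes models to the ANE through Core ML, so "supporting the ANE" would roughly mean adding a Core ML conversion path. Below is only a minimal sketch of that path, assuming a traced PyTorch module with placeholder shapes; the toy model and names are hypothetical and not part of LLaMA_MPS.

```python
# Hypothetical sketch: trace a PyTorch module, convert it with coremltools,
# and let Core ML schedule it across CPU / GPU / ANE. Shapes and the toy
# block are placeholders, not LLaMA_MPS code.
import torch
import coremltools as ct

class TinyBlock(torch.nn.Module):
    def __init__(self, dim: int = 64):
        super().__init__()
        self.proj = torch.nn.Linear(dim, dim)

    def forward(self, x):
        return torch.nn.functional.gelu(self.proj(x))

example = torch.rand(1, 16, 64)                      # (batch, seq, dim), placeholder shape
traced = torch.jit.trace(TinyBlock().eval(), example)

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="x", shape=example.shape)],
    convert_to="mlprogram",
    compute_units=ct.ComputeUnit.ALL,                # allow Core ML to place ops on the ANE
)
mlmodel.save("tiny_block.mlpackage")
```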

Since this project is the leading LLaMA implementation that leverages the Apple GPU, would it be possible to support the ANE as well?

LeiHao0 avatar Mar 25 '23 00:03 LeiHao0

I don't know whether that would provide much speedup for current LLM architectures, which are memory-bound during inference. It might be more useful for Stable Diffusion (compute-bound) or MegaByte transformers.
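
To make the memory-bound point concrete, here is a rough back-of-the-envelope sketch: every generated token has to stream all of the weights from memory once, so bandwidth caps the token rate regardless of how fast the compute unit is. The parameter count, quantization width, and bandwidth figure below are assumed round numbers for illustration, not measurements of any specific chip.

```python
# Assumed, illustrative numbers: a 30B-parameter model, 4-bit weights,
# and ~100 GB/s of unified-memory bandwidth.
model_params    = 30e9     # parameters
bytes_per_param = 0.5      # 4-bit quantized weights
bandwidth_bps   = 100e9    # memory bandwidth in bytes/s (assumed)

weights_bytes = model_params * bytes_per_param        # ~15 GB of weights
tokens_per_s  = bandwidth_bps / weights_bytes         # bandwidth-limited upper bound
print(f"weights: {weights_bytes / 1e9:.1f} GB, upper bound ~{tokens_per_s:.1f} tokens/s")
```

Under those assumptions the ceiling is roughly 6-7 tokens/s no matter which compute unit runs the matmuls, which is why a faster ANE by itself would not help much for single-stream decoding.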

philipturner avatar May 27 '23 12:05 philipturner