alpaca-electron
Performance is much worse than in llama.cpp on Apple Silicon (M1/M2)
Inference in general is much slower than with llama.cpp on my M2 Pro, using any model (I tried Vicuna 13B, LLaMA 13B, and Alpaca 13B).
I would say roughly 3x slower, all else being equal. I don't have exact timings, but it's obvious without counting the seconds.
I think the macOS binaries were built just a day before the Apple Silicon (ARM64) optimization was merged into llama.cpp. If you want, you can send me the new llama.cpp binaries (just main is fine) and I can update the app.
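For anyone who wants to rebuild it themselves, getting a fresh main from a current llama.cpp checkout on Apple Silicon is roughly this (a minimal sketch; the model path below is just the usual llama.cpp example layout, and flags can differ between versions):

```sh
# Clone and build llama.cpp; on Apple Silicon the default make build
# picks up the ARM NEON / Accelerate optimizations.
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

# Quick sanity check with a quantized model (path is just an example).
./main -m ./models/7B/ggml-model-q4_0.bin -p "Hello" -n 32
```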
I just sent you the files on Discord! :)
Awesome -- I was having trouble figuring out how to attach here, haha.
Thank you.
Are you able to get it working at all with the M2 Pro? Mine just permanently hangs after prompting.
Yes, I have an M2 Pro too (32 GB RAM) and it works, just very, very slowly. Even the 30B model loaded, but I mostly tried 13B.
Interesting. I have an M2 Max with 64 GB and I can't even get the 7B to work.
I am using a Mac Mini M2 with 32 GB; the 7B model is the best so far and pretty smart. The 30B is too slow.
I wish it could be at the level of ChatGPT, which can generate Blender Python.
OK everyone, I just updated the v1.0.5 macOS ARM64 release. Try downloading and installing it again. It should be faster now.
Works amazingly now!
Works great! Thanks for the change. The only problem now is that after clearing the chat I get

bash: Would: command not found
bash-3.2$

but I'm assuming that's unrelated to the ARM64 build.
Yeah, that's unrelated to the ARM64 macOS build. It seems something changed in the new llama.cpp and ^C now also exits llama.cpp on Linux and macOS. I haven't added code to check for that on macOS and Linux yet.