llama.cpp
Too slow on M2 MBA (16 GB RAM, 512 GB SSD)
Hi,
First of all, thanks for the tremendous work!
I just wanted to ask: compared to your demo, when I run the same input sentence the speed is dramatically slower. Is this because of the chipset difference between the M1 Pro and the M2, or is this a known issue that you are already working on?
How much slower? Post the stats from the end of the run along with the model used.
Also post the command line you are using.
Try `./main -h` and look for the `-t` argument, which sets the number of threads to use. Tuning that should make it really fast, around 0.2 s per token.
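For reference, a minimal invocation sketch; the model path and prompt below are placeholders, not the reporter's actual setup:

```sh
# Run with an explicit thread count; -t controls the number of threads.
# Model path and prompt are placeholders -- substitute your own.
./main -m ./models/7B/ggml-model-q4_0.bin \
       -p "Building a website can be done in 10 simple steps:" \
       -t 4 -n 128
# Timing stats are printed at the end of the run -- paste those here.
```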
No response from the reporter. Please reopen if the issue still persists.