
Metal issues with Intel Mac

Open zhangchn opened this issue 2 years ago • 5 comments

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • [x] I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • [x] I carefully followed the README.md.
  • [x] I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • [x] I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

The server should run properly.

Current Behavior

After launching the command, the macOS GUI froze.

./server -m path-to-gguf/some-gguf-previously-worked

Environment and Context

Please provide detailed information about your computer setup. This is important in case the issue is not reproducible except for under certain specific conditions.

  • Physical (or virtual) hardware: MacBook Pro 2022, 4-core Intel Core i5

  • Operating System: macOS 13.5.1

  • SDK version: Xcode 15.0.1 (15A507)

Failure Information (for bugs)

The GUI froze after loading the Metal shader file.

Steps to Reproduce

Please provide detailed steps for reproducing the issue. We are not sitting in front of your screen, so the more detail the better.

  1. build from git source
  2. run ./server -m path-to-gguf

The screen output suggests:

ggml_metal_init: found device: Intel(R) Iris(TM) Plus Graphics
ggml_metal_init: picking default device: Intel(R) Iris(TM) Plus Graphics
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: loading '....../llama.cpp/ggml-metal.metal'

Then the GUI froze. After about 2 minutes, the watchdog was triggered and the OS panicked.
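To compare against a CPU-only build, here is a sketch of how the Metal backend could be left out at build time. The option name LLAMA_NO_METAL=1 is my assumption for this point in the tree (with CMake, -DLLAMA_METAL=OFF should be the equivalent), so verify it against your checkout:

# Rebuild the server without the Metal backend (flag name assumed for this era)
make clean && LLAMA_NO_METAL=1 make server

# Run on the CPU only and compare (same model path as above)
./server -m path-to-gguf/some-gguf-previously-worked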

zhangchn avatar Nov 09 '23 10:11 zhangchn

Tested with a smaller model, e.g. llama-2-7b-chat.Q4_0.gguf from TheBloke/llama2-7b-chat-gguf.

With -ngl 0 and no Metal acceleration, the server output was normal and reasonable. With -ngl 1 or higher, the output was abnormal:

User: hello
Llama: hew$omm friosingahnattle Chefattrtcipeahn stackozommotrлю stack numerura Matth Fri friowyazi Stanis Esp stack stackahn Mos Basicovi immugenemblyabormosaga stackommtags attackuraahnovyommahnahnowyelsk Fra Chetags ub Grundfileowy Domwohlprevioustagsowyahnotrowy openahnéraahn Cheuraватиahnfattcipeosing stacklem Culturartahnahnahn numer Fland Matthmodel Stanisazi Fland friowy Stanis Fland Stanisowyazi Stanis$tags Stanis Flandabor Mos stack Stanisattle rag Espowyomm Stanisaga Cheelsk Fri model stack Matthaziovi Basicugen Stanisowyembly stackowy dispotr Matth Modelahnhagenhewії stackлю Fland stack宝owy Staniscipeazi Stanis Cheahnosing Stanis numer Matthfattrt Fland Che Stanisowy Cheijkuratags Matth Che Stanisowyazi Mos Culturaowy Cheowytags$ Basicabor Esptags Stanis ub Comics Cheelsk Eb Che Che openommowy Cheaga Che savotrtagstagséquipe Che mort Matthprevious Alliancehagenahn Cheague Stanis stack model Che ind Matthosing Che rag Cheлюcipefatt Flandugentagsovyovi Stanis Stanis Chert Fri Stanisazimodelijk Stanis Stanisosing看$ Cheowy Matth Stanisowy Cheowyabor Mos Che Che Che Matth Che Stanis Matth Stanis$ Cheowy Che Espіїahn Stanis$ Stanis ub Stanisotr宝 Modelopen stackabor Basic· Culturaelsk Dom numer fri Stanis ragowy Eb Fland fre Stanisemblyovi Fland rfattrt Che model Stanis Che Che Che Stanis Fland Che Stanisazi Cheosingcipe mort Fland Chetags Stanis Fri proced Chelem FlandowyCheowyrt Che Matth Stanis Espahn Cheazi Che Mosijk Che Stanisarian Fland Che Fland stackopen ubahn宝 open Stanisahn Chetagsrt numer Basicowyowy Cheaborлюahn numerahnowy Stanis numerelsk rag Stanisotr Che*$ Che Che le Stanis Che Cheowy freahn Che宝ahn Che Fland mort Stanis Eb Basic Basic Friahn Flandowyместosing Matth Esp Domtags
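For reference, these are the commands behind the comparison above; the model path is a placeholder:

# CPU only - coherent output
./server -m path-to-gguf/llama-2-7b-chat.Q4_0.gguf -ngl 0

# Offload one layer to Metal - output degrades into the gibberish above
./server -m path-to-gguf/llama-2-7b-chat.Q4_0.gguf -ngl 1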

zhangchn avatar Nov 09 '23 17:11 zhangchn

I have the same issue.

MikeLP avatar Dec 16 '23 09:12 MikeLP

Tested on a 6800 XT or an RX 580 on my Intel Mac, and I have the same issue.

Basten7 avatar Dec 18 '23 07:12 Basten7

I also have the same issue, and I've tried OpenCL mode and get gibberish for anything over ngl=1. My setup is an Intel Mac with an AMD GPU.

devYonz avatar Jan 18 '24 01:01 devYonz

I have the same issue with a 2017 13-inch MBP (2.3 GHz Dual-Core Intel Core i5, Intel Iris Plus Graphics 640 1536 MB, 8 GB 2133 MHz LPDDR3, macOS Ventura 13.6.4).

Using a smaller quantisation (Q2_K), instead of a UI freeze I get an endless stream of ggml_metal_graph_compute: command buffer 0 failed with status 5 errors. According to this Reddit comment, and unless I'm mistaken, this might be a memory issue. My recommendedMaxWorkingSetSize is 1610.61 MB, which means even 1 layer of the smallest 7B quantisation is not going to fit on my computer using Metal.
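For anyone checking their own machine, a rough way to see the GPU model and its dynamic VRAM ceiling (which is in the same ballpark as the recommendedMaxWorkingSetSize figure above); the grep pattern is only illustrative:

# Show the GPU and the dynamic VRAM limit reported by macOS on an Intel Mac
system_profiler SPDisplaysDataType | grep -E "Chipset Model|VRAM"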

tjohnman avatar Feb 19 '24 08:02 tjohnman

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot] avatar Apr 04 '24 01:04 github-actions[bot]