
Is this library working on iOS?

Open FahriBilici opened this issue 1 year ago • 5 comments

I used code similar to test_isolated.dart. There was no error when loading the model, but when I sent a prompt to the model, there was no response, even after waiting for 10 minutes. My phone is an iPhone 16 Pro, and the model I tried is Gemma-2-2B.

FahriBilici avatar Jan 12 '25 00:01 FahriBilici

Here are my steps for macOS; they might help for iOS too:

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
git checkout b4138
mkdir build
cd build
cmake .. \
  -DLLAMA_FATAL_WARNINGS=ON \
  -DLLAMA_CURL=ON \
  -DGGML_RPC=ON \
  -DBUILD_SHARED_LIBS=ON \
  -DLLAMA_NATIVE=ON
cmake --build . --config Release -j "$(sysctl -n hw.ncpu)"
cp src/libllama.dylib ../..

Also remember to set Llama.libraryPath = 'libllama.dylib'; in your code.
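Since the library file name that Llama.libraryPath expects differs per platform, here is a minimal sketch for picking it from a build script; the macOS and Linux names match this thread, while the Windows name is an assumption:

```shell
# Pick the platform-specific shared-library name to pass to Llama.libraryPath.
# Darwin/Linux names come from the steps above; the Windows one is assumed.
case "$(uname -s)" in
  Darwin) lib=libllama.dylib ;;
  Linux)  lib=libllama.so ;;
  *)      lib=llama.dll ;;
esac
echo "$lib"
```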

rekire avatar Jan 26 '25 13:01 rekire

I have also failed to run the model on my A12 Bionic. The closest I got was with the following build command, which then crashes with DartWorker (29): EXC_BAD_ACCESS (code=1, address=0x94):

cmake -DCMAKE_SYSTEM_NAME=iOS \
      -DCMAKE_OSX_ARCHITECTURES=arm64 \
      -DCMAKE_BUILD_TYPE=Release \
      -DLLAMA_BUILD_TESTS=OFF \
      -DLLAMA_BUILD_EXAMPLES=OFF \
      -DGGML_METAL=OFF \
      -DSIMD_SUM_F32_DISABLED=ON \
      -S . \
      -B build

However, I do get logs from the model, such as:

llama_model_loader: loaded meta data with 77 key-value pairs and 148 tensors from /var/containers/Bundle/Application/B8547266-2742-4522-9879-92F1B0B03CD6/Runner.app/Frameworks/Dolphin 3.0 Llama 3.2 1B GGUF.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
.....

maksymmatviievskyi avatar Jan 28 '25 22:01 maksymmatviievskyi

This may not be related to the issue, but after having difficulties with this library, I went through all the other ways of running local inference with Flutter, and eventually got it working using Flutter platform channels and the llama.swift example. You have to dig into native code a bit, but I am happy to share my workaround if anyone is interested.

maksymmatviievskyi avatar Feb 09 '25 16:02 maksymmatviievskyi

@netdur what about shipping the dependencies? I checked the latest commits, and you still don't really document exactly which version is required to make it work.

I'm working on something else, but with the same problem. I came across this article, which was quite enlightening for me: https://dev.to/leehack/how-to-use-golang-in-flutter-application-golang-ffi-1950

Most interesting is this part here:

The above script will create three static libraries. Before creating the xcframework, we must combine the two static libraries for the simulator architecture. The lipo command can combine the two static libraries.

lipo \
  -create \
  libsum_arm64_iphonesimulator.a \
  libsum_amd64_iphonesimulator.a \
  -output libsum_iphonesimulator.a
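After combining the slices, it is worth sanity-checking that the fat archive really contains both architectures. A sketch using `lipo -info`, guarded because lipo only exists on macOS:

```shell
# Verify the fat simulator library produced by the lipo -create step above.
# lipo ships with Xcode, so fall back to a notice on other systems (a sketch).
if command -v lipo >/dev/null 2>&1 && [ -f libsum_iphonesimulator.a ]; then
  lipo -info libsum_iphonesimulator.a   # should list both arm64 and x86_64
else
  echo "skipping: lipo or libsum_iphonesimulator.a not available here"
fi
```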

Now we can create the xcframework using the below command.

xcodebuild -create-xcframework \
  -output ../ios/libsum.xcframework \
  -library ios-arm64/libsum.a \
  -headers ios-arm64/libsum.h \
  -library ios-simulator/libsum.a \
  -headers ios-simulator/libsum.h

With those commands you can combine the different architectures into a single binary for the simulator, and ship everything you need inside an xcframework that you can generate with a shell script.

rekire avatar Feb 16 '25 09:02 rekire

You can check the dependencies here: https://github.com/netdur/llama_cpp_dart/tree/main/src

Clone the repo, then run darwin/run_build.sh to build llama.cpp and place the binaries in the bin folder.

Also run the fix_rpath script in darwin/ to fix the binaries' rpath and signing.

If you look at those scripts, they can easily be modified to build for iOS.
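For anyone curious what an rpath/signing fix typically involves, here is a hedged sketch; the path and steps are assumptions, and the repo's own fix_rpath script may differ:

```shell
# Hypothetical sketch of a typical macOS rpath + ad-hoc signing fix.
# The bin/ path is assumed; the actual fix_rpath script may do more.
LIB=bin/libllama.dylib
if command -v install_name_tool >/dev/null 2>&1 && [ -f "$LIB" ]; then
  install_name_tool -id "@rpath/$(basename "$LIB")" "$LIB"  # rpath-relative install name
  codesign --force --sign - "$LIB"                          # re-sign after modification
else
  echo "skipping: macOS tools or $LIB not available here"
fi
```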

netdur avatar Feb 17 '25 13:02 netdur