LocalAI

Build fails to include llama-cpp backend on Apple Silicon (arm64)

FALK-BRAUER opened this issue 4 months ago • 2 comments

LocalAI version: Built from master branch. The issue was also confirmed on pre-built official Docker images: localai/localai:latest and localai/localai:v3.4.0.

Environment, CPU architecture, OS, and Version:

  • OS: macOS
  • CPU Architecture: Apple Silicon (M-series, arm64)
  • Environment: Docker Desktop for Mac

Describe the bug

The Docker build process for the arm64 architecture fails to include the llama-cpp backend, even when explicitly instructed to do so with build arguments. This makes it impossible to run GGUF models on Apple Silicon Macs, as all attempts result in a backend not found: llama-cpp error at runtime.
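
One quick way to confirm that the backend is genuinely missing from the custom image is to inspect it before starting the server. This is only a sketch: the local-ai backends subcommand and the in-image backend paths are assumptions based on recent LocalAI releases, so adjust them to your version.

    # Ask the binary which backends it knows about
    # (the "backends list" subcommand is an assumption; check local-ai --help)
    docker run --rm --entrypoint local-ai localai-custom backends list

    # Alternatively, look for a llama-cpp backend directory in the image
    # (both paths are guesses; your image layout may differ)
    docker run --rm --entrypoint /bin/sh localai-custom -c "ls /backends /build/backends 2>/dev/null"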

To Reproduce

  1. On an Apple Silicon Mac, clone the repository:

    git clone https://github.com/mudler/LocalAI.git
    
  2. Navigate into the directory:

    cd LocalAI
    
  3. Build the Docker image with the explicit build flag to include the backend:

    docker build --build-arg GO_TAGS=llama-cpp -t localai-custom .
    
  4. Create a models directory and place a GGUF file inside (e.g., tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf).

  5. Create a models.yaml file in the LocalAI directory with the following content:

    - name: tinyllama
      backend: llama-cpp
      parameters:
        model: tinyllama-1.1b-chat-v1.0.Q4_K_M.gguf
      context_size: 2048
    
  6. Run the custom-built container:

    docker run -d --name localai -p 8080:8080 -v $(pwd)/models:/models -v $(pwd)/models.yaml:/models.yaml localai-custom --config-file /models.yaml
    
  7. Make an API request to the model:

    curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "tinyllama", "messages": [{"role": "user", "content": "Hello"}]}'
    

Expected behavior

The API should return a valid JSON response from the tinyllama model, indicating that the llama-cpp backend was found and successfully loaded the model.
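
For reference, a successful call returns an OpenAI-compatible chat completion roughly of the following shape (illustrative only; field values will differ):

    {
      "object": "chat.completion",
      "model": "tinyllama",
      "choices": [
        {
          "index": 0,
          "finish_reason": "stop",
          "message": { "role": "assistant", "content": "Hello! How can I help you today?" }
        }
      ],
      "usage": { "prompt_tokens": 10, "completion_tokens": 9, "total_tokens": 19 }
    }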

Logs

The API call consistently fails with the following JSON error response:

{"error":{"code":500,"message":"failed to load model with internal loader: backend not found: llama-cpp","type":""}}

FALK-BRAUER avatar Aug 19 '25 16:08 FALK-BRAUER

Builds don't include the backends anymore; however, it should have pulled the backend from the gallery automatically. Please share the full logs.

mudler avatar Aug 20 '25 15:08 mudler
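
For anyone hitting the same error: the container logs should show whether the automatic gallery pull was attempted and why it failed, and the backend can also be installed from the gallery explicitly. The backends install syntax below is an assumption based on the v3.x backend gallery; verify it against local-ai backends --help.

    # Collect the full startup and request logs requested above
    docker logs localai > localai.log 2>&1

    # Explicitly pull the llama-cpp backend from the gallery inside the running container
    # (subcommand syntax is an assumption; adjust to your LocalAI version)
    docker exec localai local-ai backends install llama-cpp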

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] avatar Nov 19 '25 02:11 github-actions[bot]

This issue was closed because it has been stalled for 5 days with no activity.

github-actions[bot] avatar Nov 24 '25 02:11 github-actions[bot]