llama.cpp backend is broken
LocalAI version: latest-aio-gpu-nvidia-cuda-12
Environment, CPU architecture, OS, and Version:
$ uname -a
Linux server 6.1.0-21-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.90-1 (2024-05-03) x86_64 GNU/Linux
$ lsmod | grep nvidia
nvidia_uvm 1540096 0
nvidia_drm 77824 0
drm_kms_helper 208896 1 nvidia_drm
nvidia_modeset 1314816 2 nvidia_drm
video 65536 1 nvidia_modeset
nvidia 56778752 19 nvidia_uvm,nvidia_modeset
drm 614400 4 drm_kms_helper,nvidia,nvidia_drm
Describe the bug
No model that uses llama.cpp as its backend works: chatting with any of them returns nothing.
To Reproduce
- Replicate my setup (a sketch of how I start the container is below this list)
- Chat with the pre-installed llava model from the web UI
- See no response appear in the web UI
- See odd output in the logs (described under Additional context)
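This is roughly how I start the container. A minimal sketch, assuming the NVIDIA Container Toolkit is installed; the host models directory and the in-container path are my local choices and may differ between LocalAI versions:

$ docker run -d --name local-ai --gpus all -p 8080:8080 \
    -v $PWD/models:/build/models \
    localai/localai:latest-aio-gpu-nvidia-cuda-12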
Expected behavior
I should have received a response from the model in the web UI.
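To rule the web UI in or out, the same request can be made directly against the OpenAI-compatible API. A minimal sketch; the model name below is an assumption (the AIO images expose llava under an OpenAI-style alias), so check what /v1/models actually lists first:

$ curl http://localhost:8080/v1/models
$ curl http://localhost:8080/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "gpt-4-vision-preview", "messages": [{"role": "user", "content": "Say hello"}]}'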
Logs
Here's LocalAI running from start to finish (with me prompting llava from the web UI): localai-log.txt
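For anyone who wants to reproduce the log with more verbosity, debug logging can be enabled when starting the container; a sketch, assuming the DEBUG environment variable is still honoured by this image:

$ docker run -d --name local-ai --gpus all -p 8080:8080 \
    -e DEBUG=true \
    localai/localai:latest-aio-gpu-nvidia-cuda-12
$ docker logs -f local-ai > localai-log.txt 2>&1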
Additional context
I wiped /models and ran LocalAI once before recording the log.
From what I can see, the model loads successfully in llama.cpp, but LocalAI doesn't recognize this and tries a bunch of other backends, ultimately arriving at stablediffusion.
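A possible workaround is pinning the backend in the model's YAML config so LocalAI stops falling through to other backends. A minimal sketch; the file name, the GGUF file name, and the backend identifier "llama-cpp" are assumptions, so use whatever names your install actually has:

$ cat > models/llava.yaml <<'EOF'
name: llava
backend: llama-cpp           # pin the backend so LocalAI does not guess
parameters:
  model: llava-v1.6-7b.gguf  # hypothetical GGUF file name; use the one present in /models
EOF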
Any updates?
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.