[cuda-12-diffusers] Fails: failed to load model with internal loader: could not load model. Model is not a local folder and is not a valid model identifier listed
LocalAI version: v3.4.0 (b2e8b6d1aa652b6a95828fe91271e5b686fffa7f)
Environment, CPU architecture, OS, and Version: docker
Describe the bug I think LocalAGI is trying to run a task, and the cuda-12-diffusers backend fails to load the model it needs.
To Reproduce Not sure.
Expected behavior The backend starts and runs the model.
Logs
3:01AM INF [cuda-12-diffusers] Fails: failed to load model with internal loader: could not load model (no success): Unexpected err=OSError("Qwen2.5-1.5B-Instruct-Q4_K_M.gguf is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'\nIf this is a private repository, make sure to pass a token having permission to this repo with `token` or log in with `huggingface-cli login`."), type(err)=<class 'OSError'>
Additional context LocalAI downloaded the model weeks ago.
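For context on the OSError: the diffusers backend hands the model string to Hugging Face loading code, which accepts only a local diffusers folder or a Hub repo id; a bare GGUF filename is neither, so the load fails before anything runs. A minimal sketch of the failure mode (assuming the loader ends up in `DiffusionPipeline.from_pretrained`; the actual LocalAI loader code may differ):

```python
from diffusers import DiffusionPipeline

# from_pretrained() resolves its argument either as a local folder holding a
# diffusers pipeline or as a Hub repo id like "org/model-name". A bare GGUF
# filename matches neither, so it raises the OSError quoted in the log above.
pipe = DiffusionPipeline.from_pretrained("Qwen2.5-1.5B-Instruct-Q4_K_M.gguf")
```

A GGUF chat model like this one would normally be served by the llama-cpp backend, which suggests the request was routed to the wrong backend rather than the model being missing.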
1:58PM INF BackendLoader starting backend=llama-cpp modelID=Llama3.1-8B o.model=Llama3.1-8B-Chinese-Chat.Q8_0.gguf
1:58PM DBG Loading model in memory from file: /models/Llama3.1-8B-Chinese-Chat.Q8_0.gguf
1:58PM DBG Loading Model Llama3.1-8B with gRPC (file: /models/Llama3.1-8B-Chinese-Chat.Q8_0.gguf) (backend: llama-cpp): {backendString:llama-cpp model:Llama3.1-8B-Chinese-Chat.Q8_0.gguf modelID:Llama3.1-8B context:{emptyCtx:{}} gRPCOptions:0xc000695088 externalBackends:map[] grpcAttempts:20 grpcAttemptsDelay:2 parallelRequests:false}
1:58PM ERR Server error error="failed to load model with internal loader: backend not found: llama-cpp" ip=172.18.0.1 latency=1m28.738332289s method=POST status=500 url=/v1/chat/completions
1:59PM INF Success ip=127.0.0.1 latency="24.871µs" method=GET status=200 url=/readyz
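For the original report, where a GGUF was routed to cuda-12-diffusers, one possible workaround is pinning the backend explicitly in the model's YAML config. A sketch below, with the name and file taken from the logs above and the rest being assumed LocalAI config syntax:

```yaml
# /models/llama3.1-8b.yaml — hypothetical config pinning the GGUF to llama-cpp
name: Llama3.1-8B
backend: llama-cpp
parameters:
  model: Llama3.1-8B-Chinese-Chat.Q8_0.gguf
```

The second log is a different failure: llama-cpp is already selected, but "backend not found: llama-cpp" indicates the backend itself is not installed in that image, so pinning alone would not help there.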