[3.0.0] Context local model name not found, setting to default
LocalAI version: 3.0.0
Environment, CPU architecture, OS, and Version: Docker GPU CUDA 12
Describe the bug
Every request logs "DBG context local model name not found, setting to default defaultModelName=stablediffusion". DreamShaper is one image model that previously worked with the AIO or -extra image.
TTS logs "DBG context local model name not found, setting to the first model first model name=bark-cpp-small"; TTS doesn't do anything and never starts.
I also noticed that other models don't work now:
gemma-3-4b-it, bunny-llama-3-8b-v, LocalAI-functioncall-phi-4-v0.3, deepseek-r1-distill-llama-8b
Backends installed: bark-cpp, cuda12-bark-development, cuda12-diffusers, cuda12-kokoro-development, cuda12-transformers, cuda12-transformers-development
Models installed: stablediffusion, stable-diffusion-3-medium, bark-cpp-small
To Reproduce
Chat with a model (example requests below).
Expected behavior
The model chats back.
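For reference, these are the kinds of requests that trigger it, against LocalAI's OpenAI-compatible API (host/port are assumptions; the model names and the image prompt are taken from this setup and the logs below):

```bash
# Chat completion against one of the affected models (assumes the default port 8080)
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "LocalAI-functioncall-phi-4-v0.3",
       "messages": [{"role": "user", "content": "hello"}]}'

# Image generation against the dreamshaper model (prompt as seen in the logs below)
curl http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{"model": "dreamshaper", "prompt": "cool pink sports car", "size": "512x512"}'
```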
Logs
12:40AM DBG context local model name not found, setting to default defaultModelName=stablediffusion
12:40AM DBG Parameter Config: &{PredictionOptions:{BasicModelRequest:{Model:DreamShaper_8_pruned.safetensors} Language: Translate:false N:0 TopP:0xc0abe51600 TopK:0xc0abe51608 Temperature:0xc0abe51610 Maxtokens:0xc0abe51640 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0abe51638 TypicalP:0xc0abe51630 Seed:0xc0abe51650 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 ClipSkip:0 Tokenizer:} Name:dreamshaper F16:0xc0abe515e9 Threads:0xc0abe515f0 Debug:0xc09cd01660 Roles:map[] Embeddings:0xc0abe51649 Backend:diffusers TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:false JoinChatMessagesByCharacter:<nil> Multimodal: JinjaTemplate:false ReplyPrefix:} KnownUsecaseStrings:[FLAG_IMAGE FLAG_VIDEO FLAG_ANY] KnownUsecases:<nil> Pipeline:{TTS: LLM: Transcription: VAD:} PromptStrings:[cool pink sports car] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false PropOrder: SchemaType: GrammarTriggers:[]} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[] ArgumentRegex:[] ArgumentRegexKey: ArgumentRegexValue: ReplaceFunctionResults:[] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0abe51628 MirostatTAU:0xc0abe51620 Mirostat:0xc0abe51618 NGPULayers:<nil> MMap:0xc0abe51648 MMlock:0xc0abe51649 LowVRAM:0xc0abe51649 Reranking:0xc0abe51649 Grammar: StopWords:[] Cutstrings:[] ExtractRegex:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0abe51658 NUMA:false LoraAdapter: LoraBase: LoraAdapters:[] LoraScales:[] LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: LoadFormat: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 DisableLogStatus:false DType: LimitMMPerPrompt:{LimitImagePerPrompt:0 LimitVideoPerPrompt:0 LimitAudioPerPrompt:0} MMProj: FlashAttention:false NoKVOffloading:false CacheTypeK: CacheTypeV: RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 CFGScale:0} Diffusers:{CUDA:true PipelineType:StableDiffusionPipeline SchedulerType:k_dpmpp_2m EnableParameters:negative_prompt,num_inference_steps IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:25 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: AudioPath:} CUDA:false DownloadFiles:[] Description: Usage: Options:[]}
Additional context
@mudler is this my error, or an error introduced by the last update? I have to roll back to before 3.0.0, as everything worked before the update. After updating to 3.0.0 and downloading the backends, every model gives me this error.
@Hello-World-Traveler I can't reproduce, and CI runs automated tests against most models. Diffusers works without issues here. Can you send the full logs and describe how you have set up the instance?
I noticed /build/backend/python/vllm/run.sh in the logs. I didn't have vllm installed at the time, so I installed it, but that doesn't change anything.
Image: localai/localai:latest-gpu-nvidia-cuda-12 (3.0.0)
I run it in Docker; on every update I just change the image. On 3.0.0 I created the backend folder and mapped it.
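For context, the setup is roughly the following (a sketch, not the exact command; the container name, host paths, and mount targets are assumptions based on the mappings mentioned later in this thread):

```bash
# Sketch of the Docker setup described above; adjust host paths and mount
# targets to your own layout (the 3.0.0 logs below show models under /build/models).
docker run -d --name localai --gpus all -p 8080:8080 \
  -e DEBUG=true \
  -v /srv/localai/models:/models \
  -v /srv/localai/backends:/backends \
  localai/localai:latest-gpu-nvidia-cuda-12
```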
11:26AM DBG context local model name not found, setting to default defaultModelName=stablediffusion
11:26AM DBG Loading model in memory from file: /build/models/DreamShaper_8_pruned.safetensors
11:26AM DBG Loading Model dreamshaper with gRPC (file: /build/models/DreamShaper_8_pruned.safetensors) (backend: diffusers): {backendString:diffusers model:DreamShaper_8_pruned.safetensors modelID:dreamshaper assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc09df46588 externalBackends:map[bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama2:/build/backend/python/exllama2/run.sh faster-whisper:/build/backend/python/faster-whisper/run.sh kokoro:/build/backend/python/kokoro/run.sh rerankers:/build/backend/python/rerankers/run.sh transformers:/build/backend/python/transformers/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 parallelRequests:false}
11:26AM DBG Loading external backend: /build/backend/python/diffusers/run.sh
11:26AM DBG external backend is file: &{name:run.sh size:191 mode:493 modTime:{wall:0 ext:63885945654 loc:0x468ac20} sys:{Dev:69 Ino:12853245 Nlink:1 Mode:33261 Uid:0 Gid:0 X__pad0:0 Rdev:0 Size:191 Blksize:4096 Blocks:8 Atim:{Sec:1750348854 Nsec:0} Mtim:{Sec:1750348854 Nsec:0} Ctim:{Sec:1750374434 Nsec:382657303} X__unused:[0 0 0]}}
11:26AM DBG Loading GRPC Process: /build/backend/python/diffusers/run.sh
11:26AM DBG GRPC Service for dreamshaper will be running at: '127.0.0.1:36109'
11:26AM DBG GRPC Service state dir: /tmp/go-processmanager2862674854
11:26AM DBG GRPC Service Started
11:26AM DBG Wait for the service to start up
11:26AM DBG Options: ContextSize:1024 Seed:119434318 NBatch:512 F16Memory:true MMap:true NGPULayers:9999999 Threads:10 PipelineType:"StableDiffusionPipeline" SchedulerType:"k_dpmpp_2m" CUDA:true
11:26AM DBG GRPC(dreamshaper-127.0.0.1:36109): stdout Initializing libbackend for diffusers
11:26AM DBG GRPC(dreamshaper-127.0.0.1:36109): stderr ./../common/libbackend.sh: line 94: uv: command not found
11:26AM DBG GRPC(dreamshaper-127.0.0.1:36109): stdout virtualenv created
11:26AM DBG GRPC(dreamshaper-127.0.0.1:36109): stderr ./../common/libbackend.sh: line 100: /build/backend/python/diffusers/venv/bin/activate: No such file or directory
11:26AM DBG GRPC(dreamshaper-127.0.0.1:36109): stdout virtualenv activated
11:26AM DBG GRPC(dreamshaper-127.0.0.1:36109): stdout activated virtualenv has been ensured
11:26AM DBG GRPC(dreamshaper-127.0.0.1:36109): stderr ./../common/libbackend.sh: line 183: /build/backend/python/diffusers/venv/bin/python: No such file or directory
11:39AM DBG context local model name not found, setting to default defaultModelName=stablediffusion
11:39AM DBG Parameter Config: &{PredictionOptions:{BasicModelRequest:{Model:DreamShaper_8_pruned.safetensors} Language: Translate:false N:0 TopP:0xc0abeb93c0 TopK:0xc0abeb93c8 Temperature:0xc0abeb93d0 Maxtokens:0xc0abeb9400 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0abeb93f8 TypicalP:0xc0abeb93f0 Seed:0xc0abeb9410 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 ClipSkip:0 Tokenizer:} Name:dreamshaper F16:0xc0abeb93a9 Threads:0xc0abeb93b0 Debug:0xc09e6d1440 Roles:map[] Embeddings:0xc0abeb9409 Backend:diffusers TemplateConfig:{Chat: ChatMessage: Completion: Edit: Functions: UseTokenizerTemplate:false JoinChatMessagesByCharacter:<nil> Multimodal: JinjaTemplate:false ReplyPrefix:} KnownUsecaseStrings:[FLAG_VIDEO FLAG_ANY FLAG_IMAGE] KnownUsecases:<nil> Pipeline:{TTS: LLM: Transcription: VAD:} PromptStrings:[cool pink sports car] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false PropOrder: SchemaType: GrammarTriggers:[]} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[] ArgumentRegex:[] ArgumentRegexKey: ArgumentRegexValue: ReplaceFunctionResults:[] ReplaceLLMResult:[] CaptureLLMResult:[] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0abeb93e8 MirostatTAU:0xc0abeb93e0 Mirostat:0xc0abeb93d8 NGPULayers:<nil> MMap:0xc0abeb9408 MMlock:0xc0abeb9409 LowVRAM:0xc0abeb9409 Reranking:0xc0abeb9409 Grammar: StopWords:[] Cutstrings:[] ExtractRegex:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0abeb9418 NUMA:false LoraAdapter: LoraBase: LoraAdapters:[] LoraScales:[] LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: LoadFormat: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 DisableLogStatus:false DType: LimitMMPerPrompt:{LimitImagePerPrompt:0 LimitVideoPerPrompt:0 LimitAudioPerPrompt:0} MMProj: FlashAttention:false NoKVOffloading:false CacheTypeK: CacheTypeV: RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 CFGScale:0} Diffusers:{CUDA:true PipelineType:StableDiffusionPipeline SchedulerType:k_dpmpp_2m EnableParameters:negative_prompt,num_inference_steps IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:25 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: AudioPath:} CUDA:false DownloadFiles:[] Description: Usage: Options:[]}
11:40AM INF Success ip=127.0.0.1 latency="30.135µs" method=GET status=200 url=/readyz
11:43AM DBG context local model name not found, setting to the first model first model name=suayptalha_maestro-10b
11:43AM DBG guessDefaultsFromFile: NGPULayers set NGPULayers=99999999
11:43AM DBG guessDefaultsFromFile: template already set name=LocalAI-functioncall-phi-4-v0.3
11:43AM DBG Chat endpoint configuration read: &{PredictionOptions:{BasicModelRequest:{Model:localai-functioncall-phi-4-v0.3-q4_k_m.gguf} Language: Translate:false N:0 TopP:0xc0a06a3920 TopK:0xc0a06a3928 Temperature:0xc0a06a3930 Maxtokens:0xc0a06a3960 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0a06a3958 TypicalP:0xc0a06a3950 Seed:0xc0a06a3970 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 ClipSkip:0 Tokenizer:} Name:LocalAI-functioncall-phi-4-v0.3 F16:0xc0a06a38d8 Threads:0xc0a06a3910 Debug:0xc09e964810 Roles:map[] Embeddings:0xc0a06a3969 Backend: TemplateConfig:{Chat:{{.Input}}
<|im_start|>assistant<|im_sep|>
ChatMessage:<|im_start|>{{ .RoleName }}<|im_sep|>
{{.Content}}<|im_end|>
Completion:{{.Input}}
Edit: Functions:<|im_start|>system<|im_sep|>
You are an AI assistant that executes function calls, and these are the tools at your disposal:
{{range .Functions}}
{'type': 'function', 'function': {'name': '{{.Name}}', 'description': '{{.Description}}', 'parameters': {{toJson .Parameters}} }}
{{end}}
{{.Input}}<|im_end|>
UseTokenizerTemplate:false JoinChatMessagesByCharacter:<nil> Multimodal: JinjaTemplate:false ReplyPrefix:} KnownUsecaseStrings:[FLAG_COMPLETION FLAG_ANY FLAG_CHAT] KnownUsecases:<nil> Pipeline:{TTS: LLM: Transcription: VAD:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false PropOrder:name,arguments SchemaType: GrammarTriggers:[]} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[(?s)<Output>(.*?)</Output>] ArgumentRegex:[] ArgumentRegexKey: ArgumentRegexValue: ReplaceFunctionResults:[] ReplaceLLMResult:[{Key:(?s)<Thought>(.*?)</Thought> Value:}] CaptureLLMResult:[(?s)<Thought>(.*?)</Thought>] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0a06a3948 MirostatTAU:0xc0a06a3940 Mirostat:0xc0a06a3938 NGPULayers:0xc0a1bfa738 MMap:0xc0a06a38dc MMlock:0xc0a06a3969 LowVRAM:0xc0a06a3969 Reranking:0xc0a06a3969 Grammar: StopWords:[<|end|> <|endoftext|> <|im_end|>] Cutstrings:[] ExtractRegex:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0a06a38c8 NUMA:false LoraAdapter: LoraBase: LoraAdapters:[] LoraScales:[] LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: LoadFormat: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 DisableLogStatus:false DType: LimitMMPerPrompt:{LimitImagePerPrompt:0 LimitVideoPerPrompt:0 LimitAudioPerPrompt:0} MMProj: FlashAttention:false NoKVOffloading:false CacheTypeK: CacheTypeV: RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 CFGScale:0} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: AudioPath:} CUDA:false DownloadFiles:[] Description: Usage: Options:[gpu]}
11:43AM DBG Parameters: &{PredictionOptions:{BasicModelRequest:{Model:localai-functioncall-phi-4-v0.3-q4_k_m.gguf} Language: Translate:false N:0 TopP:0xc0a06a3920 TopK:0xc0a06a3928 Temperature:0xc0a06a3930 Maxtokens:0xc0a06a3960 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 RepeatLastN:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0a06a3958 TypicalP:0xc0a06a3950 Seed:0xc0a06a3970 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 ClipSkip:0 Tokenizer:} Name:LocalAI-functioncall-phi-4-v0.3 F16:0xc0a06a38d8 Threads:0xc0a06a3910 Debug:0xc09e964810 Roles:map[] Embeddings:0xc0a06a3969 Backend: TemplateConfig:{Chat:{{.Input}}
<|im_start|>assistant<|im_sep|>
ChatMessage:<|im_start|>{{ .RoleName }}<|im_sep|>
{{.Content}}<|im_end|>
Completion:{{.Input}}
Edit: Functions:<|im_start|>system<|im_sep|>
You are an AI assistant that executes function calls, and these are the tools at your disposal:
{{range .Functions}}
{'type': 'function', 'function': {'name': '{{.Name}}', 'description': '{{.Description}}', 'parameters': {{toJson .Parameters}} }}
{{end}}
{{.Input}}<|im_end|>
UseTokenizerTemplate:false JoinChatMessagesByCharacter:<nil> Multimodal: JinjaTemplate:false ReplyPrefix:} KnownUsecaseStrings:[FLAG_COMPLETION FLAG_ANY FLAG_CHAT] KnownUsecases:<nil> Pipeline:{TTS: LLM: Transcription: VAD:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: ResponseFormat: ResponseFormatMap:map[] FunctionsConfig:{DisableNoAction:false GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:false NoMixedFreeString:false NoGrammar:false Prefix: ExpectStringsAfterJSON:false PropOrder:name,arguments SchemaType: GrammarTriggers:[]} NoActionFunctionName: NoActionDescriptionName: ResponseRegex:[] JSONRegexMatch:[(?s)<Output>(.*?)</Output>] ArgumentRegex:[] ArgumentRegexKey: ArgumentRegexValue: ReplaceFunctionResults:[] ReplaceLLMResult:[{Key:(?s)<Thought>(.*?)</Thought> Value:}] CaptureLLMResult:[(?s)<Thought>(.*?)</Thought>] FunctionNameKey: FunctionArgumentsKey:} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0a06a3948 MirostatTAU:0xc0a06a3940 Mirostat:0xc0a06a3938 NGPULayers:0xc0a1bfa738 MMap:0xc0a06a38dc MMlock:0xc0a06a3969 LowVRAM:0xc0a06a3969 Reranking:0xc0a06a3969 Grammar: StopWords:[<|end|> <|endoftext|> <|im_end|>] Cutstrings:[] ExtractRegex:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0a06a38c8 NUMA:false LoraAdapter: LoraBase: LoraAdapters:[] LoraScales:[] LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: LoadFormat: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 DisableLogStatus:false DType: LimitMMPerPrompt:{LimitImagePerPrompt:0 LimitVideoPerPrompt:0 LimitAudioPerPrompt:0} MMProj: FlashAttention:false NoKVOffloading:false CacheTypeK: CacheTypeV: RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 CFGScale:0} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} TTSConfig:{Voice: AudioPath:} CUDA:false DownloadFiles:[] Description: Usage: Options:[gpu]}
11:43AM DBG templated message for chat: <|im_start|>user<|im_sep|>
11:43AM DBG Stream request received
11:43AM INF Success ip= latency=184.712763ms method=POST status=200 url=/v1/chat/completions
11:43AM DBG Sending chunk: {"created":1750938235,"object":"chat.completion.chunk","id":"fa7b009c-0530-46f7-9f2d-7f8017ed7306","model":"LocalAI-functioncall-phi-4-v0.3","choices":[{"index":0,"finish_reason":"","delta":{"role":"assistant","content":""}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
11:48AM DBG context local model name not found, setting to the first model first model name=LocalAI-functioncall-phi-4-v0.3
11:48AM DBG guessDefaultsFromFile: NGPULayers set NGPULayers=99999999
11:48AM DBG guessDefaultsFromFile: template already set name=qwen2.5-1.5b-instruct
This is the best I can do with the logs.
Also: I get "Installation Error failed to extract image : unexpected EOF" with all of the back-end models. Is this anything to worry about? They seem to be installed.
> Also: I get "Installation Error failed to extract image : unexpected EOF" with all of the back-end models. Is this anything to worry about? They seem to be installed.
That probably indicates that the backends didn't install successfully. Are you running it in Docker? Can you try to clean your backend folder, install fresh, and come back with the logs?
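A minimal sketch of what a clean reinstall could look like with a mapped backends folder (the container name and host path here are assumptions; adjust to your setup):

```bash
# Stop the container, wipe the mapped backends folder so nothing half-extracted
# is left behind, then start again, reinstall the backends (e.g. from the WebUI
# gallery), and watch the logs for "failed to extract image" errors.
docker stop localai
rm -rf /srv/localai/backends/*
docker start localai
docker logs -f localai
```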
2:25PM INF [silero-vad] Fails: failed to load model with internal loader: could not load model: rpc error: code = Unknown desc = create silero detector: failed to create session: Load model from /models/gemma-3n-E4B-it-Q8_0.gguf failed:Protobuf parsing failed.
2:25PM INF [stablediffusion-ggml] Attempting to load
2:25PM INF BackendLoader starting backend=stablediffusion-ggml modelID=gemma-3n-e4b-it o.model=gemma-3n-E4B-it-Q8_0.gguf
2:25PM DBG Loading model in memory from file: /models/gemma-3n-E4B-it-Q8_0.gguf
2:25PM DBG Loading Model gemma-3n-e4b-it with gRPC (file: /models/gemma-3n-E4B-it-Q8_0.gguf) (backend: stablediffusion-ggml): {backendString:stablediffusion-ggml model:gemma-3n-E4B-it-Q8_0.gguf modelID:gemma-3n-e4b-it assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc0003f2b08 externalBackends:map[bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama2:/build/backend/python/exllama2/run.sh faster-whisper:/build/backend/python/faster-whisper/run.sh kokoro:/build/backend/python/kokoro/run.sh rerankers:/build/backend/python/rerankers/run.sh transformers:/build/backend/python/transformers/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 parallelRequests:false}
2:25PM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/stablediffusion-ggml
[/build/backend/python/coqui/run.sh] Fails: failed to load model with internal loader: backend not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/coqui/run.sh
2:25PM INF [/build/backend/python/kokoro/run.sh] Attempting to load
2:25PM DBG Loading model in memory from file: /models/gemma-3n-E4B-it-Q8_0.gguf
2:25PM DBG Loading Model gemma-3n-e4b-it with gRPC (file: /models/gemma-3n-E4B-it-Q8_0.gguf) (backend: /build/backend/python/kokoro/run.sh): {backendString:/build/backend/python/kokoro/run.sh model:gemma-3n-E4B-it-Q8_0.gguf modelID:gemma-3n-e4b-it assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc0003f2b08 externalBackends:map[bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama2:/build/backend/python/exllama2/run.sh faster-whisper:/build/backend/python/faster-whisper/run.sh kokoro:/build/backend/python/kokoro/run.sh rerankers:/build/backend/python/rerankers/run.sh transformers:/build/backend/python/transformers/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 parallelRequests:false}
2:25PM INF [/build/backend/python/kokoro/run.sh] Fails: failed to load model with internal loader: backend not found: /tmp/localai/backend_data/backend-assets/grpc/build/backend/python/kokoro/run.sh
The working folder is set to /, and /models and /backends are mapped. The backends are not fully installing, as you said. gemma-3n-E4B installed without issue.
Is there another way of installing the backends?
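For reference, one possible alternative, assuming the backend gallery is also exposed through the CLI in this version (check local-ai backends --help for the exact subcommands), is installing from inside the container:

```bash
# Hedged sketch: drive the backend gallery from the CLI inside the running
# container instead of the WebUI; subcommand names may differ between versions.
docker exec -it localai local-ai backends list
docker exec -it localai local-ai backends install cuda12-diffusers
```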
Came here to confirm the message.
I had an older version of LocalAI from April 2025, though I cannot remember which one. Everything worked perfectly from the original install until a hard drive crash. I installed the current version, 3.3.2, using the online installer (not Docker), and suddenly no backends were included, and local-ai just leaves empty backends and configuration folders in every directory it is run from. Even though all the missing parts seem to load OK, every response starts with this message:
DBG context local model name not found, setting to the first model first model name=mistral-small-24b-instruct-2501
Installed backends and models for reference:
# tree /usr/share/local-ai/
/usr/share/local-ai/
├── backends
│ ├── cuda12-llama-cpp
│ │ ├── lib
│ │ │ ├── ld.so
│ │ │ ├── libc.so.6
│ │ │ ├── libgcc_s.so.1
│ │ │ ├── libgomp.so.1
│ │ │ ├── libm.so.6
│ │ │ └── libstdc++.so.6
│ │ ├── llama-cpp-avx
│ │ ├── llama-cpp-avx2
│ │ ├── llama-cpp-avx512
│ │ ├── llama-cpp-fallback
│ │ ├── llama-cpp-grpc
│ │ ├── llama-cpp-rpc-server
│ │ ├── metadata.json
│ │ └── run.sh
│ └── localai@llama-cpp
│ └── metadata.json
├── configuration
└── models
├── Mistral-Small-24B-Instruct-2501-Q4_K_M.gguf
└── mistral-small-24b-instruct-2501.yaml
[Edited to remove speculation on the cause.]
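For completeness, a model config of the kind listed in the tree above, pinning the backend explicitly, looks roughly like this (field names are from the standard LocalAI model YAML; the values are assumptions, not the actual contents of the file above):

```yaml
# Hypothetical minimal config; values are placeholders, not the real file.
name: mistral-small-24b-instruct-2501
backend: llama-cpp
parameters:
  model: Mistral-Small-24B-Instruct-2501-Q4_K_M.gguf
context_size: 8192
f16: true
```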
@mudler 3.5.0 also has this issue, and because of this it does not try to open a backend.
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.