I have a MacBook M1 Pro with 16 GB of RAM.
I used the following command:
docker run -p 8080:8080 --name local-ai -ti localai/localai:latest-aio-cpu
I then opened the frontend and tried to chat and to generate an image, but nothing works; I get the same errors via curl, as you can see in the logs below. I'm willing to debug or run some tests, but I'm fairly new to this world and a bit lost, so I'm not sure where to look. Don't hesitate to point me in the right direction.
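One thing I can share for context: I wondered whether the image might be an x86_64 build running under emulation on Apple silicon, which would fit the "SIGILL: illegal instruction" further down in the logs. This is just a sanity check using standard Docker CLI commands, nothing LocalAI-specific:

```shell
# Host CPU architecture; Apple silicon reports arm64:
uname -m

# Architecture the pulled image was built for (skipped if Docker
# isn't available in the current shell):
if command -v docker >/dev/null 2>&1; then
  docker image inspect localai/localai:latest-aio-cpu \
    --format '{{.Os}}/{{.Architecture}}'
fi
```

If the image reports linux/amd64 while the host is arm64, the container is running under emulation.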
curl http://localhost:8080/v1/models
{"object":"list","data":[{"id":"gpt-4","object":"model"},{"id":"gpt-4-vision-preview","object":"model"},{"id":"jina-reranker-v1-base-en","object":"model"},{"id":"stablediffusion","object":"model"},{"id":"text-embedding-ada-002","object":"model"},{"id":"tts-1","object":"model"},{"id":"whisper-1","object":"model"},{"id":"MODEL_CARD","object":"model"},{"id":"bakllava-mmproj.gguf","object":"model"},{"id":"voice-en-us-amy-low.tar.gz","object":"model"}]}
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{ "model": "gpt-4", "messages": [{"role": "user", "content": "How are you doing?"}], "temperature": 0.1 }'
{"error":{"code":500,"message":"rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 127.0.0.1:45337: connect: connection refused\"","type":""}}
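For what it's worth, I also double-checked that a cleaned-up version of the chat request body is valid JSON before blaming the server (python3 assumed available on the host):

```shell
# Same chat request body; "temperature" sits at the request top level,
# alongside "messages", as in the OpenAI API:
payload='{ "model": "gpt-4", "messages": [{"role": "user", "content": "How are you doing?"}], "temperature": 0.1 }'
# json.tool fails loudly if the payload is malformed:
echo "$payload" | python3 -m json.tool >/dev/null && echo "payload OK"
```

So the 500 doesn't seem to come from a malformed request.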
curl http://localhost:8080/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "A cute baby sea otter",
    "size": "256x256"
  }'
{"error":{"code":500,"message":"rpc error: code = Unavailable desc = error reading from server: EOF","type":""}}
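Since both endpoints fail the same way, I also probed the API's own health endpoint (/readyz shows up in the debug logs below) to confirm the HTTP layer itself is up:

```shell
# Prints the HTTP status code of /readyz, plus a fallback message if
# nothing answers; --max-time keeps curl from hanging on a wedged container:
curl -s --max-time 5 -o /dev/null -w '%{http_code}\n' http://localhost:8080/readyz \
  || echo "no response"
```

In my case this returns 200, so the API frontend is fine and only the model backends are dying.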
I also tried with the docker-compose file, which produced more verbose logs (I assume because of the debug flag). You can see them below; I'm not sure they'll help, as there's really a lot:
api-1 | 9:28AM INF Success ip=172.18.0.1 latency=2.93675ms method=GET status=200 url=/chat/
api-1 | 9:28AM INF Success ip=172.18.0.1 latency="149.25µs" method=GET status=200 url=/static/assets/highlightjs.css
api-1 | 9:28AM INF Success ip=172.18.0.1 latency="105.541µs" method=GET status=200 url=/static/assets/highlightjs.js
api-1 | 9:28AM INF Success ip=172.18.0.1 latency="207.834µs" method=GET status=200 url=/static/general.css
api-1 | 9:28AM INF Success ip=172.18.0.1 latency="563.292µs" method=GET status=200 url=/static/assets/font2.css
api-1 | 9:28AM INF Success ip=172.18.0.1 latency="123.5µs" method=GET status=200 url=/static/assets/font1.css
api-1 | 9:28AM INF Success ip=172.18.0.1 latency="156.666µs" method=GET status=200 url=/static/assets/fontawesome/css/brands.css
api-1 | 9:28AM INF Success ip=172.18.0.1 latency="44.209µs" method=GET status=200 url=/static/assets/fontawesome/css/solid.css
api-1 | 9:28AM INF Success ip=172.18.0.1 latency="53µs" method=GET status=200 url=/static/assets/htmx.js
api-1 | 9:28AM INF Success ip=172.18.0.1 latency="97.917µs" method=GET status=200 url=/static/assets/tailwindcss.js
api-1 | 9:28AM INF Success ip=172.18.0.1 latency="66.083µs" method=GET status=200 url=/static/assets/tw-elements.css
api-1 | 9:28AM INF Success ip=172.18.0.1 latency="36.459µs" method=GET status=200 url=/static/assets/fontawesome/css/fontawesome.css
api-1 | 9:28AM INF Success ip=172.18.0.1 latency="232.125µs" method=GET status=200 url=/static/assets/alpine.js
api-1 | 9:28AM INF Success ip=172.18.0.1 latency="145.25µs" method=GET status=200 url=/static/assets/marked.js
api-1 | 9:28AM INF Success ip=172.18.0.1 latency="494.875µs" method=GET status=200 url=/static/assets/purify.js
api-1 | 9:28AM INF Success ip=172.18.0.1 latency="105.625µs" method=GET status=200 url=/static/chat.js
api-1 | 9:28AM INF Success ip=172.18.0.1 latency="162.375µs" method=GET status=200 url=/static/assets/fontawesome/webfonts/fa-solid-900.woff2
api-1 | 9:28AM INF Success ip=172.18.0.1 latency="125.834µs" method=GET status=200 url=/static/assets/UcCO3FwrK3iLTeHuS_fvQtMwCp50KnMw2boKoduKmMEVuFuYMZg.ttf
api-1 | 9:28AM INF Success ip=172.18.0.1 latency=1.243208ms method=GET status=200 url=/static/assets/UcCO3FwrK3iLTeHuS_fvQtMwCp50KnMw2boKoduKmMEVuLyfMZg.ttf
api-1 | 9:28AM INF Success ip=172.18.0.1 latency="144.583µs" method=GET status=200 url=/static/assets/UcCO3FwrK3iLTeHuS_fvQtMwCp50KnMw2boKoduKmMEVuGKYMZg.ttf
api-1 | 9:28AM DBG Request received: {"model":"gpt-4","language":"","n":0,"top_p":null,"top_k":null,"temperature":null,"max_tokens":null,"echo":false,"batch":0,"ignore_eos":false,"repeat_penalty":0,"n_keep":0,"frequency_penalty":0,"presence_penalty":0,"tfz":null,"typical_p":null,"seed":null,"negative_prompt":"","rope_freq_base":0,"rope_freq_scale":0,"negative_prompt_scale":0,"use_fast_tokenizer":false,"clip_skip":0,"tokenizer":"","file":"","response_format":{},"size":"","prompt":null,"instruction":"","input":null,"stop":null,"messages":[{"role":"user","content":"hello how are you"}],"functions":null,"function_call":null,"stream":true,"mode":0,"step":0,"grammar":"","grammar_json_functions":null,"grammar_json_name":null,"backend":"","model_base_name":""}
api-1 | 9:28AM DBG Configuration read: &{PredictionOptions:{Model:b5869d55688a529c3738cb044e92c331 Language: N:0 TopP:0xc0008a9fb8 TopK:0xc0008a9fc0 Temperature:0xc0008a9fc8 Maxtokens:0xc0008a9ff8 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0008a9ff0 TypicalP:0xc0008a9fe8 Seed:0xc0012f0010 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:gpt-4 F16:0xc0008a9fb0 Threads:0xc0008a9fa8 Debug:0xc00032aa30 Roles:map[] Embeddings:false Backend: TemplateConfig:{Chat:{{.Input -}}
api-1 | <|im_start|>assistant
api-1 | ChatMessage:<|im_start|>{{if eq .RoleName "assistant"}}assistant{{else if eq .RoleName "system"}}system{{else if eq .RoleName "tool"}}tool{{else if eq .RoleName "user"}}user{{end}}
api-1 | {{- if .FunctionCall }}
api-1 | <tool_call>
api-1 | {{- else if eq .RoleName "tool" }}
api-1 | <tool_response>
api-1 | {{- end }}
api-1 | {{- if .Content}}
api-1 | {{.Content }}
api-1 | {{- end }}
api-1 | {{- if .FunctionCall}}
api-1 | {{toJson .FunctionCall}}
api-1 | {{- end }}
api-1 | {{- if .FunctionCall }}
api-1 | </tool_call>
api-1 | {{- else if eq .RoleName "tool" }}
api-1 | </tool_response>
api-1 | {{- end }}<|im_end|>
api-1 | Completion:{{.Input}}
api-1 | Edit: Functions:<|im_start|>system
api-1 | You are a function calling AI model.
api-1 | Here are the available tools:
api-1 |
api-1 | {{range .Functions}}
api-1 | {'type': 'function', 'function': {'name': '{{.Name}}', 'description': '{{.Description}}', 'parameters': {{toJson .Parameters}} }}
api-1 | {{end}}
api-1 |
api-1 | You should call the tools provided to you sequentially
api-1 | Please use XML tags to record your reasoning and planning before you call the functions as follows:
api-1 |
api-1 | {step-by-step reasoning and plan in bullet points}
api-1 |
api-1 | For each function call return a json object with function name and arguments within <tool_call> XML tags as follows:
api-1 | <tool_call>
api-1 | {"arguments": , "name": }
api-1 | </tool_call><|im_end|>
api-1 | {{.Input -}}
api-1 | <|im_start|>assistant UseTokenizerTemplate:false JoinChatMessagesByCharacter:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:true GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:true NoMixedFreeString:false NoGrammar:false Prefix:} NoActionFunctionName: NoActionDescriptionName: ResponseRegex: JSONRegexMatch:[(?s)<tool_call>(.?)</tool_call> (?s)<tool_call>(.?)] ReplaceFunctionResults:[{Key:(?s)^[^{[]* Value:} {Key:(?s)[^}]]$ Value:} {Key:'([^']?)' Value:DQUOTE${1}DQUOTE} {Key:\" Value:TEMP_QUOTE} {Key:' Value:'} {Key:DQUOTE Value:"} {Key:TEMP_QUOTE Value:"} {Key:(?s). Value:}] ReplaceLLMResult:[{Key:(?s). Value:}] FunctionName:true} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0008a9fe0 MirostatTAU:0xc0008a9fd8 Mirostat:0xc0008a9fd0 NGPULayers:0xc0012f0000 MMap:0xc0008a9ec8 MMlock:0xc0012f0009 LowVRAM:0xc0012f0009 Grammar: StopWords:[<|im_end|> </tool_call> <|eot_id|> <|end_of_text|>] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0008a9ed0 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:} CUDA:false DownloadFiles:[] Description: Usage:}
api-1 | 9:28AM DBG Parameters: &{PredictionOptions:{Model:b5869d55688a529c3738cb044e92c331 Language: N:0 TopP:0xc0008a9fb8 TopK:0xc0008a9fc0 Temperature:0xc0008a9fc8 Maxtokens:0xc0008a9ff8 Echo:false Batch:0 IgnoreEOS:false RepeatPenalty:0 Keep:0 FrequencyPenalty:0 PresencePenalty:0 TFZ:0xc0008a9ff0 TypicalP:0xc0008a9fe8 Seed:0xc0012f0010 NegativePrompt: RopeFreqBase:0 RopeFreqScale:0 NegativePromptScale:0 UseFastTokenizer:false ClipSkip:0 Tokenizer:} Name:gpt-4 F16:0xc0008a9fb0 Threads:0xc0008a9fa8 Debug:0xc00032aa30 Roles:map[] Embeddings:false Backend: TemplateConfig:{Chat:{{.Input -}}
api-1 | <|im_start|>assistant
api-1 | ChatMessage:<|im_start|>{{if eq .RoleName "assistant"}}assistant{{else if eq .RoleName "system"}}system{{else if eq .RoleName "tool"}}tool{{else if eq .RoleName "user"}}user{{end}}
api-1 | {{- if .FunctionCall }}
api-1 | <tool_call>
api-1 | {{- else if eq .RoleName "tool" }}
api-1 | <tool_response>
api-1 | {{- end }}
api-1 | {{- if .Content}}
api-1 | {{.Content }}
api-1 | {{- end }}
api-1 | {{- if .FunctionCall}}
api-1 | {{toJson .FunctionCall}}
api-1 | {{- end }}
api-1 | {{- if .FunctionCall }}
api-1 | </tool_call>
api-1 | {{- else if eq .RoleName "tool" }}
api-1 | </tool_response>
api-1 | {{- end }}<|im_end|>
api-1 | Completion:{{.Input}}
api-1 | Edit: Functions:<|im_start|>system
api-1 | You are a function calling AI model.
api-1 | Here are the available tools:
api-1 |
api-1 | {{range .Functions}}
api-1 | {'type': 'function', 'function': {'name': '{{.Name}}', 'description': '{{.Description}}', 'parameters': {{toJson .Parameters}} }}
api-1 | {{end}}
api-1 |
api-1 | You should call the tools provided to you sequentially
api-1 | Please use XML tags to record your reasoning and planning before you call the functions as follows:
api-1 |
api-1 | {step-by-step reasoning and plan in bullet points}
api-1 |
api-1 | For each function call return a json object with function name and arguments within <tool_call> XML tags as follows:
api-1 | <tool_call>
api-1 | {"arguments": , "name": }
api-1 | </tool_call><|im_end|>
api-1 | {{.Input -}}
api-1 | <|im_start|>assistant UseTokenizerTemplate:false JoinChatMessagesByCharacter:} PromptStrings:[] InputStrings:[] InputToken:[] functionCallString: functionCallNameString: FunctionsConfig:{DisableNoAction:true GrammarConfig:{ParallelCalls:false DisableParallelNewLines:false MixedMode:true NoMixedFreeString:false NoGrammar:false Prefix:} NoActionFunctionName: NoActionDescriptionName: ResponseRegex: JSONRegexMatch:[(?s)<tool_call>(.?)</tool_call> (?s)<tool_call>(.?)] ReplaceFunctionResults:[{Key:(?s)^[^{[]* Value:} {Key:(?s)[^}]]$ Value:} {Key:'([^']?)' Value:DQUOTE${1}DQUOTE} {Key:\" Value:TEMP_QUOTE} {Key:' Value:'} {Key:DQUOTE Value:"} {Key:TEMP_QUOTE Value:"} {Key:(?s). Value:}] ReplaceLLMResult:[{Key:(?s). Value:}] FunctionName:true} FeatureFlag:map[] LLMConfig:{SystemPrompt: TensorSplit: MainGPU: RMSNormEps:0 NGQA:0 PromptCachePath: PromptCacheAll:false PromptCacheRO:false MirostatETA:0xc0008a9fe0 MirostatTAU:0xc0008a9fd8 Mirostat:0xc0008a9fd0 NGPULayers:0xc0012f0000 MMap:0xc0008a9ec8 MMlock:0xc0012f0009 LowVRAM:0xc0012f0009 Grammar: StopWords:[<|im_end|> </tool_call> <|eot_id|> <|end_of_text|>] Cutstrings:[] TrimSpace:[] TrimSuffix:[] ContextSize:0xc0008a9ed0 NUMA:false LoraAdapter: LoraBase: LoraScale:0 NoMulMatQ:false DraftModel: NDraft:0 Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: FlashAttention:false NoKVOffloading:false RopeScaling: ModelType: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0} AutoGPTQ:{ModelBaseName: Device: Triton:false UseFastTokenizer:false} Diffusers:{CUDA:false PipelineType: SchedulerType: EnableParameters: CFGScale:0 IMG2IMG:false ClipSkip:0 ClipModel: ClipSubFolder: ControlNet:} Step:0 GRPC:{Attempts:0 AttemptsSleepTime:0} VallE:{AudioPath:} CUDA:false DownloadFiles:[] Description: Usage:}
api-1 | 9:28AM DBG templated message for chat: <|im_start|>user
api-1 | hello how are you<|im_end|>
api-1 |
api-1 | 9:28AM DBG Prompt (before templating): <|im_start|>user
api-1 | hello how are you<|im_end|>
api-1 |
api-1 | 9:28AM DBG Template found, input modified to: <|im_start|>user
api-1 | hello how are you<|im_end|>
api-1 | <|im_start|>assistant
api-1 |
api-1 | 9:28AM DBG Prompt (after templating): <|im_start|>user
api-1 | hello how are you<|im_end|>
api-1 | <|im_start|>assistant
api-1 |
api-1 | 9:28AM DBG Stream request received
api-1 | 9:28AM INF Success ip=172.18.0.1 latency=20.585875ms method=POST status=200 url=/v1/chat/completions
api-1 | 9:28AM DBG Sending chunk: {"created":1718097931,"object":"chat.completion.chunk","id":"5f8ccf99-dd06-46ba-90af-3de57d022fa2","model":"gpt-4","choices":[{"index":0,"finish_reason":"","delta":{"role":"assistant","content":""}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
api-1 |
api-1 | 9:28AM DBG Loading from the following backends (in order): [llama-cpp llama-ggml gpt4all llama-cpp-fallback rwkv stablediffusion piper whisper huggingface bert-embeddings /build/backend/python/vllm/run.sh /build/backend/python/exllama/run.sh /build/backend/python/openvoice/run.sh /build/backend/python/diffusers/run.sh /build/backend/python/exllama2/run.sh /build/backend/python/parler-tts/run.sh /build/backend/python/bark/run.sh /build/backend/python/vall-e-x/run.sh /build/backend/python/sentencetransformers/run.sh /build/backend/python/autogptq/run.sh /build/backend/python/sentencetransformers/run.sh /build/backend/python/mamba/run.sh /build/backend/python/coqui/run.sh /build/backend/python/petals/run.sh /build/backend/python/rerankers/run.sh /build/backend/python/transformers/run.sh /build/backend/python/transformers-musicgen/run.sh]
api-1 | 9:28AM INF Trying to load the model 'b5869d55688a529c3738cb044e92c331' with the backend '[llama-cpp llama-ggml gpt4all llama-cpp-fallback rwkv stablediffusion piper whisper huggingface bert-embeddings /build/backend/python/vllm/run.sh /build/backend/python/exllama/run.sh /build/backend/python/openvoice/run.sh /build/backend/python/diffusers/run.sh /build/backend/python/exllama2/run.sh /build/backend/python/parler-tts/run.sh /build/backend/python/bark/run.sh /build/backend/python/vall-e-x/run.sh /build/backend/python/sentencetransformers/run.sh /build/backend/python/autogptq/run.sh /build/backend/python/sentencetransformers/run.sh /build/backend/python/mamba/run.sh /build/backend/python/coqui/run.sh /build/backend/python/petals/run.sh /build/backend/python/rerankers/run.sh /build/backend/python/transformers/run.sh /build/backend/python/transformers-musicgen/run.sh]'
api-1 | 9:28AM INF [llama-cpp] Attempting to load
api-1 | 9:28AM INF Loading model 'b5869d55688a529c3738cb044e92c331' with backend llama-cpp
api-1 | 9:28AM DBG Loading model in memory from file: /build/models/b5869d55688a529c3738cb044e92c331
api-1 | 9:28AM DBG Loading Model b5869d55688a529c3738cb044e92c331 with gRPC (file: /build/models/b5869d55688a529c3738cb044e92c331) (backend: llama-cpp): {backendString:llama-cpp model:b5869d55688a529c3738cb044e92c331 threads:4 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc000456008 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh openvoice:/build/backend/python/openvoice/run.sh parler-tts:/build/backend/python/parler-tts/run.sh petals:/build/backend/python/petals/run.sh rerankers:/build/backend/python/rerankers/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
api-1 | WARNING: open /sys/devices/system/node/node0/cpu0/cache/index1/size: no such file or directory
api-1 | WARNING: open /sys/devices/system/node/node0/cpu0/cache/index1/size: no such file or directory
api-1 | WARNING: open /sys/devices/system/node/node0/cpu0/cache/index2/size: no such file or directory
api-1 | WARNING: open /sys/devices/system/node/node0/cpu0/cache/index1/size: no such file or directory
api-1 | WARNING: open /sys/devices/system/node/node0/cpu0/cache/index1/size: no such file or directory
api-1 | WARNING: open /sys/devices/system/node/node0/cpu0/cache/index2/size: no such file or directory
api-1 | 9:28AM INF [llama-cpp] attempting to load with fallback variant
api-1 | 9:28AM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/llama-cpp-fallback
api-1 | 9:28AM DBG GRPC Service for b5869d55688a529c3738cb044e92c331 will be running at: '127.0.0.1:37001'
api-1 | 9:28AM DBG GRPC Service state dir: /tmp/go-processmanager2171005986
api-1 | 9:28AM DBG GRPC Service Started
api-1 | 9:28AM INF Success ip=127.0.0.1 latency="153.625µs" method=GET status=200 url=/readyz
api-1 | 9:29AM ERR failed starting/connecting to the gRPC service error="rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing: dial tcp 127.0.0.1:37001: connect: connection refused\""
api-1 | 9:29AM DBG GRPC Service NOT ready
api-1 | 9:29AM INF [llama-cpp] Fails: grpc service not ready
api-1 | 9:29AM INF [llama-ggml] Attempting to load
api-1 | 9:29AM INF Loading model 'b5869d55688a529c3738cb044e92c331' with backend llama-ggml
api-1 | 9:29AM DBG Loading model in memory from file: /build/models/b5869d55688a529c3738cb044e92c331
api-1 | 9:29AM DBG Loading Model b5869d55688a529c3738cb044e92c331 with gRPC (file: /build/models/b5869d55688a529c3738cb044e92c331) (backend: llama-ggml): {backendString:llama-ggml model:b5869d55688a529c3738cb044e92c331 threads:4 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc000456008 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh openvoice:/build/backend/python/openvoice/run.sh parler-tts:/build/backend/python/parler-tts/run.sh petals:/build/backend/python/petals/run.sh rerankers:/build/backend/python/rerankers/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
api-1 | 9:29AM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/llama-ggml
api-1 | 9:29AM DBG GRPC Service for b5869d55688a529c3738cb044e92c331 will be running at: '127.0.0.1:41757'
api-1 | 9:29AM DBG GRPC Service state dir: /tmp/go-processmanager1324711957
api-1 | 9:29AM DBG GRPC Service Started
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr 2024/06/11 09:29:13 gRPC Server listening at 127.0.0.1:41757
api-1 | 9:29AM DBG GRPC Service Ready
api-1 | 9:29AM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:} sizeCache:0 unknownFields:[] Model:b5869d55688a529c3738cb044e92c331 ContextSize:8192 Seed:2044693788 NBatch:512 F16Memory:false MLock:false MMap:true VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:99999999 MainGPU: TensorSplit: Threads:4 LibrarySearchPath: RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/build/models/b5869d55688a529c3738cb044e92c331 Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 Type: FlashAttention:false NoKVOffload:false}
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr create_gpt_params: loading model /build/models/b5869d55688a529c3738cb044e92c331
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr SIGILL: illegal instruction
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr PC=0x8afdaa m=3 sigcode=2
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr signal arrived during cgo execution
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr instruction bytes: 0xc4 0xe2 0x79 0x13 0xc9 0xc5 0xf2 0x59 0x15 0xd9 0xd3 0x20 0x0 0xc4 0x81 0x7a
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr goroutine 22 gp=0xc000105500 m=3 mp=0xc000059008 [syscall]:
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.cgocall(0x843980, 0xc00009b640)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/cgocall.go:157 +0x4b fp=0xc00009b618 sp=0xc00009b5e0 pc=0x4145eb
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr github.com/go-skynet/go-llama%2ecpp._Cfunc_load_model(0x4054000b70, 0x2000, 0x0, 0x0, 0x0, 0x0, 0x1, 0x0, 0x5f5e0ff, 0x200, ...)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr _cgo_gotypes.go:254 +0x4c fp=0xc00009b640 sp=0xc00009b618 pc=0x83754c
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr github.com/go-skynet/go-llama%2ecpp.New({0xc00012e210, 0x2e}, {0xc000139240, 0x8, 0x92c8c0?})
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /build/sources/go-llama.cpp/llama.go:28 +0x28a fp=0xc00009b7c0 sp=0xc00009b640 pc=0x837cea
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr main.(*LLM).Load(0xc000036990, 0xc0001a0248)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /build/backend/go/llm/llama-ggml/llama.go:73 +0x92e fp=0xc00009b900 sp=0xc00009b7c0 pc=0x840e4e
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr github.com/go-skynet/LocalAI/pkg/grpc.(*server).LoadModel(0xc000036a40, {0x9b44e0?, 0xc0000649e8?}, 0xc0001a0248)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /build/pkg/grpc/server.go:50 +0xe7 fp=0xc00009b9a8 sp=0xc00009b900 pc=0x83dba7
api-1 | 9:29AM INF [llama-ggml] Fails: could not load model: rpc error: code = Unavailable desc = error reading from server: EOF
api-1 | 9:29AM INF [gpt4all] Attempting to load
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr github.com/go-skynet/LocalAI/pkg/grpc/proto._Backend_LoadModel_Handler({0x9b44e0, 0xc000036a40}, {0xa975e0, 0xc0001fa9c0}, 0xc000162e80, 0x0)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /build/pkg/grpc/proto/backend_grpc.pb.go:352 +0x1a6 fp=0xc00009b9f8 sp=0xc00009b9a8 pc=0x831b26
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr google.golang.org/grpc.(*Server).processUnaryRPC(0xc0001ac1e0, {0xa975e0, 0xc0001fa900}, {0xa9afc0, 0xc0000ce9c0}, 0xc0001406c0, 0xc00021a9f0, 0xdd71f0, 0x0)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/google.golang.org/[email protected]/server.go:1343 +0xdd1 fp=0xc00009bdf0 sp=0xc00009b9f8 pc=0x818a51
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr google.golang.org/grpc.(*Server).handleStream(0xc0001ac1e0, {0xa9afc0, 0xc0000ce9c0}, 0xc0001406c0)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/google.golang.org/[email protected]/server.go:1737 +0xc47 fp=0xc00009bf78 sp=0xc00009bdf0 pc=0x81da07
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr google.golang.org/grpc.(*Server).serveStreams.func1.1()
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/google.golang.org/[email protected]/server.go:986 +0x86 fp=0xc00009bfe0 sp=0xc00009bf78 pc=0x816926
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.goexit({})
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc00009bfe8 sp=0xc00009bfe0 pc=0x47c8c1
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr created by google.golang.org/grpc.(*Server).serveStreams.func1 in goroutine 10
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/google.golang.org/[email protected]/server.go:997 +0x136
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr goroutine 1 gp=0xc0000061c0 m=nil [IO wait]:
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.gopark(0xc000040008?, 0x0?, 0xc0?, 0x61?, 0xc0001c5b60?)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:402 +0xce fp=0xc0001c5b28 sp=0xc0001c5b08 pc=0x44ac0e
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.netpollblock(0xc0001c5bc0?, 0x413d86?, 0x0?)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/netpoll.go:573 +0xf7 fp=0xc0001c5b60 sp=0xc0001c5b28 pc=0x443a17
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr internal/poll.runtime_pollWait(0x4061045f20, 0x72)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/netpoll.go:345 +0x85 fp=0xc0001c5b80 sp=0xc0001c5b60 pc=0x477785
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr internal/poll.(*pollDesc).wait(0x3?, 0x1?, 0x0)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc0001c5ba8 sp=0xc0001c5b80 pc=0x4e1fa7
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr internal/poll.(*pollDesc).waitRead(...)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/internal/poll/fd_poll_runtime.go:89
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr internal/poll.(*FD).Accept(0xc0000fa400)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/internal/poll/fd_unix.go:611 +0x2ac fp=0xc0001c5c50 sp=0xc0001c5ba8 pc=0x4e734c
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr net.(*netFD).accept(0xc0000fa400)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/net/fd_unix.go:172 +0x29 fp=0xc0001c5d08 sp=0xc0001c5c50 pc=0x56bdc9
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr net.(*TCPListener).accept(0xc0000722e0)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/net/tcpsock_posix.go:159 +0x1e fp=0xc0001c5d30 sp=0xc0001c5d08 pc=0x58317e
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr net.(*TCPListener).Accept(0xc0000722e0)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/net/tcpsock.go:327 +0x30 fp=0xc0001c5d60 sp=0xc0001c5d30 pc=0x582370
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr google.golang.org/grpc.(*Server).Serve(0xc0001ac1e0, {0xa96bf0, 0xc0000722e0})
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/google.golang.org/[email protected]/server.go:852 +0x469 fp=0xc0001c5e98 sp=0xc0001c5d60 pc=0x8155a9
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr github.com/go-skynet/LocalAI/pkg/grpc.StartServer({0x4000800965?, 0xc000116250?}, {0xa9d2d0, 0xc000036990})
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /build/pkg/grpc/server.go:226 +0x170 fp=0xc0001c5f20 sp=0xc0001c5e98 pc=0x840270
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr main.main()
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /build/backend/go/llm/llama-ggml/main.go:16 +0x85 fp=0xc0001c5f50 sp=0xc0001c5f20 pc=0x842ea5
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.main()
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:271 +0x29d fp=0xc0001c5fe0 sp=0xc0001c5f50 pc=0x44a7dd
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.goexit({})
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0001c5fe8 sp=0xc0001c5fe0 pc=0x47c8c1
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr goroutine 2 gp=0xc000006c40 m=nil [force gc (idle)]:
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:402 +0xce fp=0xc000052fa8 sp=0xc000052f88 pc=0x44ac0e
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.goparkunlock(...)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:408
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.forcegchelper()
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:326 +0xb3 fp=0xc000052fe0 sp=0xc000052fa8 pc=0x44aa93
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.goexit({})
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000052fe8 sp=0xc000052fe0 pc=0x47c8c1
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr created by runtime.init.6 in goroutine 1
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:314 +0x1a
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr goroutine 3 gp=0xc000007500 m=nil [GC sweep wait]:
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.gopark(0x0?, 0x0?, 0x0?, 0x0?, 0x0?)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:402 +0xce fp=0xc000053780 sp=0xc000053760 pc=0x44ac0e
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.goparkunlock(...)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:408
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.bgsweep(0xc00007c000)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgcsweep.go:278 +0x94 fp=0xc0000537c8 sp=0xc000053780 pc=0x436214
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.gcenable.gowrap1()
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:203 +0x25 fp=0xc0000537e0 sp=0xc0000537c8 pc=0x42ab65
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.goexit({})
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000537e8 sp=0xc0000537e0 pc=0x47c8c1
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr created by runtime.gcenable in goroutine 1
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:203 +0x66
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr goroutine 4 gp=0xc000007a40 m=nil [GC scavenge wait]:
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.gopark(0xc00007c000?, 0xa8f0e0?, 0x1?, 0x0?, 0xc000007a40?)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:402 +0xce fp=0xc000053f78 sp=0xc000053f58 pc=0x44ac0e
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.goparkunlock(...)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:408
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.(*scavengerState).park(0xe24180)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgcscavenge.go:425 +0x49 fp=0xc000053fa8 sp=0xc000053f78 pc=0x433c09
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.bgscavenge(0xc00007c000)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgcscavenge.go:653 +0x3c fp=0xc000053fc8 sp=0xc000053fa8 pc=0x43419c
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.gcenable.gowrap2()
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:204 +0x25 fp=0xc000053fe0 sp=0xc000053fc8 pc=0x42ab05
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.goexit({})
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000053fe8 sp=0xc000053fe0 pc=0x47c8c1
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr created by runtime.gcenable in goroutine 1
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/mgc.go:204 +0xa5
api-1 | 9:29AM INF Loading model 'b5869d55688a529c3738cb044e92c331' with backend gpt4all
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr goroutine 18 gp=0xc000104380 m=nil [finalizer wait]:
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.gopark(0xc000052648?, 0x41e245?, 0xa8?, 0x1?, 0xc0000061c0?)
api-1 | 9:29AM DBG Loading model in memory from file: /build/models/b5869d55688a529c3738cb044e92c331
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:402 +0xce fp=0xc000052620 sp=0xc000052600 pc=0x44ac0e
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.runfinq()
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/mfinal.go:194 +0x107 fp=0xc0000527e0 sp=0xc000052620 pc=0x429ba7
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.goexit({})
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000527e8 sp=0xc0000527e0 pc=0x47c8c1
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr created by runtime.createfing in goroutine 1
api-1 | 9:29AM DBG Loading Model b5869d55688a529c3738cb044e92c331 with gRPC (file: /build/models/b5869d55688a529c3738cb044e92c331) (backend: gpt4all): {backendString:gpt4all model:b5869d55688a529c3738cb044e92c331 threads:4 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc000456008 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh openvoice:/build/backend/python/openvoice/run.sh parler-tts:/build/backend/python/parler-tts/run.sh petals:/build/backend/python/petals/run.sh rerankers:/build/backend/python/rerankers/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/mfinal.go:164 +0x3d
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr goroutine 8 gp=0xc000105880 m=nil [select]:
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.gopark(0xc000251f00?, 0x2?, 0x1e?, 0x0?, 0xc000251ed4?)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:402 +0xce fp=0xc000251d80 sp=0xc000251d60 pc=0x44ac0e
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.selectgo(0xc000251f00, 0xc000251ed0, 0x7b0236?, 0x0, 0xc000240000?, 0x1)
api-1 | 9:29AM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/gpt4all
api-1 | 9:29AM DBG GRPC Service for b5869d55688a529c3738cb044e92c331 will be running at: '127.0.0.1:34885'
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/select.go:327 +0x725 fp=0xc000251ea0 sp=0xc000251d80 pc=0x45bf85
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr google.golang.org/grpc/internal/transport.(*controlBuffer).get(0xc0000945a0, 0x1)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/google.golang.org/[email protected]/internal/transport/controlbuf.go:418 +0x113 fp=0xc000251f30 sp=0xc000251ea0 pc=0x78f153
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr google.golang.org/grpc/internal/transport.(*loopyWriter).run(0xc000200380)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/google.golang.org/[email protected]/internal/transport/controlbuf.go:552 +0x86 fp=0xc000251f90 sp=0xc000251f30 pc=0x78f8a6
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr google.golang.org/grpc/internal/transport.NewServerTransport.func2()
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_server.go:336 +0xd5 fp=0xc000251fe0 sp=0xc000251f90 pc=0x7a6195
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.goexit({})
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000251fe8 sp=0xc000251fe0 pc=0x47c8c1
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr created by google.golang.org/grpc/internal/transport.NewServerTransport in goroutine 7
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_server.go:333 +0x1a8c
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr goroutine 9 gp=0xc00022e380 m=nil [select]:
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.gopark(0xc000055740?, 0x4?, 0xf0?, 0x55?, 0xc0000556c0?)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:402 +0xce fp=0xc000055558 sp=0xc000055538 pc=0x44ac0e
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.selectgo(0xc000055740, 0xc0000556b8, 0x0?, 0x0, 0x0?, 0x1)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/select.go:327 +0x725 fp=0xc000055678 sp=0xc000055558 pc=0x45bf85
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr google.golang.org/grpc/internal/transport.(*http2Server).keepalive(0xc0000ce9c0)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_server.go:1152 +0x205 fp=0xc0000557c8 sp=0xc000055678 pc=0x7ad2e5
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr google.golang.org/grpc/internal/transport.NewServerTransport.gowrap1()
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_server.go:339 +0x25 fp=0xc0000557e0 sp=0xc0000557c8 pc=0x7a6085
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.goexit({})
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc0000557e8 sp=0xc0000557e0 pc=0x47c8c1
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr created by google.golang.org/grpc/internal/transport.NewServerTransport in goroutine 7
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_server.go:339 +0x1ace
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr goroutine 10 gp=0xc00022e540 m=nil [IO wait]:
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.gopark(0x9963a0?, 0xc0001fa8d0?, 0x6?, 0x0?, 0xb?)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/proc.go:402 +0xce fp=0xc000063ab0 sp=0xc000063a90 pc=0x44ac0e
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.netpollblock(0x4c8078?, 0x413d86?, 0x0?)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/netpoll.go:573 +0xf7 fp=0xc000063ae8 sp=0xc000063ab0 pc=0x443a17
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr internal/poll.runtime_pollWait(0x4061045e28, 0x72)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/netpoll.go:345 +0x85 fp=0xc000063b08 sp=0xc000063ae8 pc=0x477785
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr internal/poll.(*pollDesc).wait(0xc0000fa500?, 0xc000238000?, 0x0)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/internal/poll/fd_poll_runtime.go:84 +0x27 fp=0xc000063b30 sp=0xc000063b08 pc=0x4e1fa7
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr internal/poll.(*pollDesc).waitRead(...)
api-1 | 9:29AM DBG GRPC Service state dir: /tmp/go-processmanager626580063
api-1 | 9:29AM DBG GRPC Service Started
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/internal/poll/fd_poll_runtime.go:89
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr internal/poll.(*FD).Read(0xc0000fa500, {0xc000238000, 0x8000, 0x8000})
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/internal/poll/fd_unix.go:164 +0x27a fp=0xc000063bc8 sp=0xc000063b30 pc=0x4e329a
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr net.(*netFD).Read(0xc0000fa500, {0xc000238000?, 0x1060100000000?, 0x8?})
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/net/fd_posix.go:55 +0x25 fp=0xc000063c10 sp=0xc000063bc8 pc=0x569de5
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr net.(*conn).Read(0xc0000562d8, {0xc000238000?, 0x800010601?, 0x0?})
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/net/net.go:179 +0x45 fp=0xc000063c58 sp=0xc000063c10 pc=0x57a385
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr net.(*TCPConn).Read(0x0?, {0xc000238000?, 0xc000063cb0?, 0x46a5ed?})
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr :1 +0x25 fp=0xc000063c88 sp=0xc000063c58 pc=0x58c4a5
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr bufio.(*Reader).Read(0xc0000924e0, {0xc000248040, 0x9, 0xc00007e008?})
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/bufio/bufio.go:241 +0x197 fp=0xc000063cc0 sp=0xc000063c88 pc=0x52f657
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr io.ReadAtLeast({0xa94260, 0xc0000924e0}, {0xc000248040, 0x9, 0x9}, 0x9)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/io/io.go:335 +0x90 fp=0xc000063d08 sp=0xc000063cc0 pc=0x4c1eb0
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr io.ReadFull(...)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/io/io.go:354
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr golang.org/x/net/http2.readFrameHeader({0xc000248040, 0x9, 0xc00002c468?}, {0xa94260?, 0xc0000924e0?})
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/x/[email protected]/http2/frame.go:237 +0x65 fp=0xc000063d58 sp=0xc000063d08 pc=0x77c2c5
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr golang.org/x/net/http2.(*Framer).ReadFrame(0xc000248000)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/x/[email protected]/http2/frame.go:498 +0x85 fp=0xc000063e00 sp=0xc000063d58 pc=0x77ca05
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr google.golang.org/grpc/internal/transport.(*http2Server).HandleStreams(0xc0000ce9c0, 0xc00021af90)
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/google.golang.org/[email protected]/internal/transport/http2_server.go:636 +0x145 fp=0xc000063f08 sp=0xc000063e00 pc=0x7a9225
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr google.golang.org/grpc.(*Server).serveStreams(0xc0001ac1e0, {0xa9afc0, 0xc0000ce9c0})
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/google.golang.org/[email protected]/server.go:979 +0x1aa fp=0xc000063f80 sp=0xc000063f08 pc=0x8166ea
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr google.golang.org/grpc.(*Server).handleRawConn.func1()
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/google.golang.org/[email protected]/server.go:920 +0x45 fp=0xc000063fe0 sp=0xc000063f80 pc=0x815f45
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr runtime.goexit({})
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_amd64.s:1695 +0x1 fp=0xc000063fe8 sp=0xc000063fe0 pc=0x47c8c1
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr created by google.golang.org/grpc.(*Server).handleRawConn in goroutine 7
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr /root/go/pkg/mod/google.golang.org/[email protected]/server.go:919 +0x15b
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr rax 0x0
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr rbx 0xeca4c0
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr rcx 0x6534343062633833
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr rdx 0x3133336332396534
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr rdi 0x1
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr rsi 0x405083f810
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr rbp 0xeea4c0
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr rsp 0x405083f7f0
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr r8 0x6d2f646c6975622f
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr r9 0x35622f736c65646f
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr r10 0x4000b973c8
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr r11 0x4000c66650
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr r12 0xf0a4c0
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr r13 0xf2a4c0
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr r14 0xe8a4c0
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr r15 0x0
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr rip 0x8afdaa
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr rflags 0x246
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr cs 0x33
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr fs 0x0
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:41757): stderr gs 0x0
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:34885): stderr 2024/06/11 09:29:15 gRPC Server listening at 127.0.0.1:34885
api-1 | 9:29AM DBG GRPC Service Ready
api-1 | 9:29AM DBG GRPC: Loading model with options: {state:{NoUnkeyedLiterals:{} DoNotCompare:[] DoNotCopy:[] atomicMessageInfo:} sizeCache:0 unknownFields:[] Model:b5869d55688a529c3738cb044e92c331 ContextSize:8192 Seed:2044693788 NBatch:512 F16Memory:false MLock:false MMap:true VocabOnly:false LowVRAM:false Embeddings:false NUMA:false NGPULayers:99999999 MainGPU: TensorSplit: Threads:4 LibrarySearchPath:/tmp/localai/backend_data/backend-assets/gpt4all RopeFreqBase:0 RopeFreqScale:0 RMSNormEps:0 NGQA:0 ModelFile:/build/models/b5869d55688a529c3738cb044e92c331 Device: UseTriton:false ModelBaseName: UseFastTokenizer:false PipelineType: SchedulerType: CUDA:false CFGScale:0 IMG2IMG:false CLIPModel: CLIPSubfolder: CLIPSkip:0 ControlNet: Tokenizer: LoraBase: LoraAdapter: LoraScale:0 NoMulMatQ:false DraftModel: AudioPath: Quantization: GPUMemoryUtilization:0 TrustRemoteCode:false EnforceEager:false SwapSpace:0 MaxModelLen:0 TensorParallelSize:0 MMProj: RopeScaling: YarnExtFactor:0 YarnAttnFactor:0 YarnBetaFast:0 YarnBetaSlow:0 Type: FlashAttention:false NoKVOffload:false}
api-1 | 9:29AM DBG GRPC(b5869d55688a529c3738cb044e92c331-127.0.0.1:34885): stderr load_model: error 'Model format not supported (no matching implementation found)'
api-1 | 9:29AM INF [gpt4all] Fails: could not load model: rpc error: code = Unknown desc = failed loading model
api-1 | 9:29AM INF [llama-cpp-fallback] Attempting to load
api-1 | 9:29AM INF Loading model 'b5869d55688a529c3738cb044e92c331' with backend llama-cpp-fallback
api-1 | 9:29AM DBG Loading model in memory from file: /build/models/b5869d55688a529c3738cb044e92c331
api-1 | 9:29AM DBG Loading Model b5869d55688a529c3738cb044e92c331 with gRPC (file: /build/models/b5869d55688a529c3738cb044e92c331) (backend: llama-cpp-fallback): {backendString:llama-cpp-fallback model:b5869d55688a529c3738cb044e92c331 threads:4 assetDir:/tmp/localai/backend_data context:{emptyCtx:{}} gRPCOptions:0xc000456008 externalBackends:map[autogptq:/build/backend/python/autogptq/run.sh bark:/build/backend/python/bark/run.sh coqui:/build/backend/python/coqui/run.sh diffusers:/build/backend/python/diffusers/run.sh exllama:/build/backend/python/exllama/run.sh exllama2:/build/backend/python/exllama2/run.sh huggingface-embeddings:/build/backend/python/sentencetransformers/run.sh mamba:/build/backend/python/mamba/run.sh openvoice:/build/backend/python/openvoice/run.sh parler-tts:/build/backend/python/parler-tts/run.sh petals:/build/backend/python/petals/run.sh rerankers:/build/backend/python/rerankers/run.sh sentencetransformers:/build/backend/python/sentencetransformers/run.sh transformers:/build/backend/python/transformers/run.sh transformers-musicgen:/build/backend/python/transformers-musicgen/run.sh vall-e-x:/build/backend/python/vall-e-x/run.sh vllm:/build/backend/python/vllm/run.sh] grpcAttempts:20 grpcAttemptsDelay:2 singleActiveBackend:false parallelRequests:false}
api-1 | 9:29AM DBG Loading GRPC Process: /tmp/localai/backend_data/backend-assets/grpc/llama-cpp-fallback
api-1 | 9:29AM DBG GRPC Service for b5869d55688a529c3738cb044e92c331 will be running at: '127.0.0.1:34333'
api-1 | 9:29AM DBG GRPC Service state dir: /tmp/go-processmanager353054777
api-1 | 9:29AM DBG GRPC Service Started
api-1 | 9:29AM INF Success ip=127.0.0.1 latency="165.125µs" method=GET status=200 url=/readyz
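As a side note while re-reading my curl request above: I think I also had `temperature` nested inside the message object instead of at the top level of the request body. A small helper I'm using now to build the body (assuming the OpenAI-style request shape LocalAI exposes):

```python
import json

def chat_payload(prompt, model="gpt-4", temperature=0.1):
    """Build the JSON body for /v1/chat/completions.

    Note: temperature belongs at the top level of the request,
    not inside the message object (my original curl put it
    inside "messages").
    """
    return json.dumps({
        "model": model,
        "temperature": temperature,
        "messages": [{"role": "user", "content": prompt}],
    })
```

That said, a malformed `temperature` shouldn't cause a 500, so I assume the gRPC backend crash above is the real problem.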
So using the following docker-compose.yaml:
services:
api:
image: localai/localai:latest-aio-cpu
# For a specific version:
# image: localai/localai:v2.16.0-aio-cpu
# For Nvidia GPUs uncomment one of the following (cuda11 or cuda12):
# image: localai/localai:v2.16.0-aio-gpu-nvidia-cuda-11
# image: localai/localai:v2.16.0-aio-gpu-nvidia-cuda-12
# image: localai/localai:latest-aio-gpu-nvidia-cuda-11
# image: localai/localai:latest-aio-gpu-nvidia-cuda-12
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8080/readyz"]
interval: 1m
timeout: 20m
retries: 5
ports:
- 8080:8080
environment:
- DEBUG=true
- REBUILD=true
# ...
volumes:
- ./models:/build/models:cached
# uncomment the following piece if running with Nvidia GPUs
# deploy:
# resources:
# reservations:
# devices:
# - driver: nvidia
# count: 1
# capabilities: [gpu]
I ran the command: docker compose up
And here are the logs:
[+] Running 2/0
✔ Container localai-api-1 Recreated 0.0s
! api The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested 0.0s
Attaching to api-1
api-1 | ===> LocalAI All-in-One (AIO) container starting...
api-1 | GPU acceleration is not enabled or supported. Defaulting to CPU.
api-1 | ===> Starting LocalAI[cpu] with the following models: /aio/cpu/embeddings.yaml,/aio/cpu/rerank.yaml,/aio/cpu/text-to-speech.yaml,/aio/cpu/image-gen.yaml,/aio/cpu/text-to-text.yaml,/aio/cpu/speech-to-text.yaml,/aio/cpu/vision.yaml
api-1 | go mod edit -replace github.com/donomii/go-rwkv.cpp=/build/sources/go-rwkv.cpp
api-1 | go mod edit -replace github.com/ggerganov/whisper.cpp=/build/sources/whisper.cpp
api-1 | go mod edit -replace github.com/ggerganov/whisper.cpp/bindings/go=/build/sources/whisper.cpp/bindings/go
api-1 | go mod edit -replace github.com/go-skynet/go-bert.cpp=/build/sources/go-bert.cpp
api-1 | go mod edit -replace github.com/M0Rf30/go-tiny-dream=/build/sources/go-tiny-dream
api-1 | go mod edit -replace github.com/mudler/go-piper=/build/sources/go-piper
api-1 | go mod edit -replace github.com/mudler/go-stable-diffusion=/build/sources/go-stable-diffusion
api-1 | go mod edit -replace github.com/nomic-ai/gpt4all/gpt4all-bindings/golang=/build/sources/gpt4all/gpt4all-bindings/golang
api-1 | go mod edit -replace github.com/go-skynet/go-llama.cpp=/build/sources/go-llama.cpp
api-1 | go mod download
api-1 | mkdir -p pkg/grpc/proto
api-1 | protoc -Ibackend/ --go_out=pkg/grpc/proto/ --go_opt=paths=source_relative --go-grpc_out=pkg/grpc/proto/ --go-grpc_opt=paths=source_relative \
api-1 | backend/backend.proto
api-1 | mkdir -p backend-assets/grpc
api-1 | go build -ldflags "-X "github.com/go-skynet/LocalAI/internal.Version=v2.16.0" -X "github.com/go-skynet/LocalAI/internal.Commit=e0187c2a1a4cde837398ada217d0ad161b7976d6"" -tags "" -o backend-assets/grpc/huggingface ./backend/go/llm/langchain/
api-1 | google.golang.org/grpc/internal/grpcsync: /root/go/pkg/mod/golang.org/[email protected]/pkg/tool/linux_amd64/compile: signal: segmentation fault
api-1 | make: *** [Makefile:648: backend-assets/grpc/huggingface] Error 1
api-1 exited with code 2
I deleted the container and image and reran the command:
[+] Running 23/23
✔ api 22 layers [⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿⣿] 0B/0B Pulled 225.8s
✔ a8b1c5f80c2d Pull complete 10.2s
✔ 405684db4ae6 Pull complete 22.5s
✔ 4ff83e19be7d Pull complete 13.8s
✔ 2712b753b418 Pull complete 18.1s
✔ 9ae961914d16 Pull complete 20.8s
✔ 264a672511ef Pull complete 22.9s
✔ 8c612aa911e2 Pull complete 25.5s
✔ 4f4fb700ef54 Pull complete 25.5s
✔ e2ad56550d99 Pull complete 56.4s
✔ 39cd8576d808 Pull complete 30.4s
✔ 5ba0f74e3aa0 Pull complete 30.4s
✔ ad213cf231b1 Pull complete 38.7s
✔ 6d32f2479c87 Pull complete 37.0s
✔ 5ecfd81d3634 Pull complete 176.6s
✔ e416a0c31998 Pull complete 62.6s
✔ 7fdefb701692 Pull complete 120.4s
✔ 68ff0c1f4076 Pull complete 89.7s
✔ dd60b6054961 Pull complete 96.6s
✔ 87419acb68c3 Pull complete 102.8s
✔ a95756a22567 Pull complete 107.8s
✔ 5826a814ef66 Pull complete 116.5s
✔ 43fa8fdbe65f Pull complete 121.3s
[+] Running 2/1
✔ Container localai-api-1 Created 0.1s
! api The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested 0.0s
Attaching to api-1
api-1 | ===> LocalAI All-in-One (AIO) container starting...
api-1 | GPU acceleration is not enabled or supported. Defaulting to CPU.
api-1 | ===> Starting LocalAI[cpu] with the following models: /aio/cpu/embeddings.yaml,/aio/cpu/rerank.yaml,/aio/cpu/text-to-speech.yaml,/aio/cpu/image-gen.yaml,/aio/cpu/text-to-text.yaml,/aio/cpu/speech-to-text.yaml,/aio/cpu/vision.yaml
api-1 | go mod edit -replace github.com/donomii/go-rwkv.cpp=/build/sources/go-rwkv.cpp
api-1 | go mod edit -replace github.com/ggerganov/whisper.cpp=/build/sources/whisper.cpp
api-1 | go mod edit -replace github.com/ggerganov/whisper.cpp/bindings/go=/build/sources/whisper.cpp/bindings/go
api-1 | go mod edit -replace github.com/go-skynet/go-bert.cpp=/build/sources/go-bert.cpp
api-1 | go mod edit -replace github.com/M0Rf30/go-tiny-dream=/build/sources/go-tiny-dream
api-1 | go mod edit -replace github.com/mudler/go-piper=/build/sources/go-piper
api-1 | go mod edit -replace github.com/mudler/go-stable-diffusion=/build/sources/go-stable-diffusion
api-1 | go mod edit -replace github.com/nomic-ai/gpt4all/gpt4all-bindings/golang=/build/sources/gpt4all/gpt4all-bindings/golang
api-1 | go mod edit -replace github.com/go-skynet/go-llama.cpp=/build/sources/go-llama.cpp
api-1 | go mod download
api-1 | mkdir -p pkg/grpc/proto
api-1 | protoc -Ibackend/ --go_out=pkg/grpc/proto/ --go_opt=paths=source_relative --go-grpc_out=pkg/grpc/proto/ --go-grpc_opt=paths=source_relative \
api-1 | backend/backend.proto
api-1 | mkdir -p backend-assets/grpc
api-1 | go build -ldflags "-X "github.com/go-skynet/LocalAI/internal.Version=v2.16.0" -X "github.com/go-skynet/LocalAI/internal.Commit=e0187c2a1a4cde837398ada217d0ad161b7976d6"" -tags "" -o backend-assets/grpc/huggingface ./backend/go/llm/langchain/
api-1 | crypto/internal/nistec: /root/go/pkg/mod/golang.org/[email protected]/pkg/tool/linux_amd64/compile: signal: segmentation fault
api-1 | mime: /root/go/pkg/mod/golang.org/[email protected]/pkg/tool/linux_amd64/compile: signal: segmentation fault
api-1 | make: *** [Makefile:648: backend-assets/grpc/huggingface] Error 1
api-1 exited with code 2
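The platform warning at the top of the log is the likely culprit: the latest-aio-cpu image is published for linux/amd64, so on an M1 it runs under emulation, and the Go compiler segfaults inside the emulated container (the two "signal: segmentation fault" lines above). A minimal sketch for spotting the mismatch, assuming `uname -m` for the host and `docker image inspect --format '{{.Architecture}}'` for the image; arch_matches is a hypothetical helper, not part of LocalAI:

```shell
# Normalize the host arch name and compare it with the image arch.
arch_matches() {
  host="$1"; image="$2"
  case "$host" in
    arm64|aarch64) host_norm="arm64" ;;
    x86_64|amd64)  host_norm="amd64" ;;
    *)             host_norm="$host" ;;
  esac
  [ "$host_norm" = "$image" ]
}

# M1 host vs the amd64-only aio-cpu image from the warning above:
arch_matches "$(uname -m)" amd64 && echo "native" || echo "emulated"
```

On an Apple Silicon host this reports "emulated", which is exactly the situation the docs warn about.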
So weird, it at least started yesterday. Is it because of the REBUILD flag?
OK, so I read this in the doc:
If running on Apple Silicon (ARM) it is not suggested to run on Docker due to emulation. Follow the build instructions to use Metal acceleration for full GPU support.
I have a MacBook M1 Pro 16GB, so I gave it a try:
$ /usr/bin/xcodebuild -version
Xcode 15.4
Build version 15F31d
$python --version
Python 3.11.5
$brew install abseil cmake go grpc protobuf protoc-gen-go protoc-gen-go-grpc wget
==> Downloading https://formulae.brew.sh/api/formula.jws.json
################################################################################################################################################################################################## 100.0%
==> Downloading https://formulae.brew.sh/api/cask.jws.json
################################################################################################################################################################################################## 100.0%
Warning: Treating cmake as a formula. For the cask, use homebrew/cask/cmake or specify the `--cask` flag.
Warning: abseil 20240116.2 is already installed and up-to-date.
To reinstall 20240116.2, run:
brew reinstall abseil
Warning: cmake 3.29.5 is already installed and up-to-date.
To reinstall 3.29.5, run:
brew reinstall cmake
Warning: go 1.22.4 is already installed and up-to-date.
To reinstall 1.22.4, run:
brew reinstall go
Warning: grpc 1.62.2_1 is already installed and up-to-date.
To reinstall 1.62.2_1, run:
brew reinstall grpc
Warning: protobuf 27.0 is already installed and up-to-date.
To reinstall 27.0, run:
brew reinstall protobuf
protoc-gen-go 1.34.1 is already installed but outdated (so it will be upgraded).
Warning: protoc-gen-go-grpc 1.4.0 is already installed and up-to-date.
To reinstall 1.4.0, run:
brew reinstall protoc-gen-go-grpc
Warning: wget 1.24.5 is already installed and up-to-date.
To reinstall 1.24.5, run:
brew reinstall wget
==> Downloading https://ghcr.io/v2/homebrew/core/protoc-gen-go/manifests/1.34.2
################################################################################################################################################################################################## 100.0%
==> Fetching protoc-gen-go
==> Downloading https://ghcr.io/v2/homebrew/core/protoc-gen-go/blobs/sha256:74a1e9415b32c7f9884a7bbdbcb981c4a74d8b7511bb4a5e101c25f915cd0556
################################################################################################################################################################################################## 100.0%
==> Upgrading protoc-gen-go
1.34.1 -> 1.34.2
==> Pouring protoc-gen-go--1.34.2.arm64_sonoma.bottle.tar.gz
🍺 /opt/homebrew/Cellar/protoc-gen-go/1.34.2: 6 files, 4.5MB
==> Running `brew cleanup protoc-gen-go`...
Disable this behaviour by setting HOMEBREW_NO_INSTALL_CLEANUP.
Hide these hints with HOMEBREW_NO_ENV_HINTS (see `man brew`).
Removing: /opt/homebrew/Cellar/protoc-gen-go/1.34.1... (6 files, 4.5MB)
Removing: /Users/gdenis/Library/Caches/Homebrew/protoc-gen-go_bottle_manifest--1.34.1... (9.8KB)
Removing: /Users/gdenis/Library/Caches/Homebrew/protoc-gen-go--1.34.1... (1.7MB)
$conda create -n localai python=3.11.5 -y
$pip install --user grpcio-tools
Collecting grpcio-tools
Downloading grpcio_tools-1.64.1-cp311-cp311-macosx_10_9_universal2.whl.metadata (5.3 kB)
Requirement already satisfied: protobuf<6.0dev,>=5.26.1 in /Users/gdenis/.local/lib/python3.11/site-packages (from grpcio-tools) (5.27.1)
Collecting grpcio>=1.64.1 (from grpcio-tools)
Downloading grpcio-1.64.1-cp311-cp311-macosx_10_9_universal2.whl.metadata (3.3 kB)
Requirement already satisfied: setuptools in /Users/gdenis/anaconda3/envs/localai/lib/python3.11/site-packages (from grpcio-tools) (69.5.1)
Downloading grpcio_tools-1.64.1-cp311-cp311-macosx_10_9_universal2.whl (5.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.3/5.3 MB 3.3 MB/s eta 0:00:00
Downloading grpcio-1.64.1-cp311-cp311-macosx_10_9_universal2.whl (10.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.4/10.4 MB 7.6 MB/s eta 0:00:00
Installing collected packages: grpcio, grpcio-tools
Successfully installed grpcio-1.64.1 grpcio-tools-1.64.1
$pwd
/Users/nomopo45/Documents/LocalAI
$make build
And it works!
Now, would this command also work:
make GO_TAGS=stablediffusion,tts build
Is it OK to separate GO_TAGS by a comma?
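For what it's worth, the build log earlier shows the Makefile passing GO_TAGS straight through to go build's -tags flag (the `-tags ""` in the huggingface build line), and Go's -tags flag accepts a comma-separated list, so comma separation should be fine and needs no shell quoting. A quick sketch:

```shell
# A comma is not special to the shell, so the variable passes through intact.
GO_TAGS=stablediffusion,tts
echo go build -tags "$GO_TAGS" ./...
# prints: go build -tags stablediffusion,tts ./...
```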
I have the following error with this command: make GO_TAGS=stablediffusion,tts build
I llama.cpp build info:
cp -rf overrides/* stable-diffusion/x86/vs2019_opencv-mobile_ncnn-dll_demo/vs2019_opencv-mobile_ncnn-dll_demo/
c++ -I./ncnn -I./ncnn/src -I./ncnn/build/src/ -I. -I./stable-diffusion/x86/vs2019_opencv-mobile_ncnn-dll_demo/vs2019_opencv-mobile_ncnn-dll_demo -O3 -DNDEBUG -std=c++17 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function stablediffusion.cpp -o stablediffusion.o -c
In file included from stablediffusion.cpp:9:
In file included from ./ncnn/src/net.h:18:
In file included from ./ncnn/src/blob.h:18:
./ncnn/src/mat.h:1164:9: warning: '_Atomic' is a C11 extension [-Wc11-extensions]
NCNN_XADD(m.refcount, 1);
^
./ncnn/src/allocator.h:125:56: note: expanded from macro 'NCNN_XADD'
#define NCNN_XADD(addr, delta) __c11_atomic_fetch_add((_Atomic(int)*)(addr), delta, __ATOMIC_ACQ_REL)
^
In file included from stablediffusion.cpp:9:
In file included from ./ncnn/src/net.h:18:
In file included from ./ncnn/src/blob.h:18:
./ncnn/src/mat.h:1188:9: warning: '_Atomic' is a C11 extension [-Wc11-extensions]
NCNN_XADD(refcount, 1);
^
./ncnn/src/allocator.h:125:56: note: expanded from macro 'NCNN_XADD'
#define NCNN_XADD(addr, delta) __c11_atomic_fetch_add((_Atomic(int)*)(addr), delta, __ATOMIC_ACQ_REL)
^
In file included from stablediffusion.cpp:9:
In file included from ./ncnn/src/net.h:18:
In file included from ./ncnn/src/blob.h:18:
./ncnn/src/mat.h:1193:21: warning: '_Atomic' is a C11 extension [-Wc11-extensions]
if (refcount && NCNN_XADD(refcount, -1) == 1)
^
./ncnn/src/allocator.h:125:56: note: expanded from macro 'NCNN_XADD'
#define NCNN_XADD(addr, delta) __c11_atomic_fetch_add((_Atomic(int)*)(addr), delta, __ATOMIC_ACQ_REL)
^
In file included from stablediffusion.cpp:11:
./stable-diffusion/x86/vs2019_opencv-mobile_ncnn-dll_demo/vs2019_opencv-mobile_ncnn-dll_demo/decoder_slover.h:11:10: fatal error: 'opencv2/opencv.hpp' file not found
#include <opencv2/opencv.hpp>
^~~~~~~~~~~~~~~~~~~~
3 warnings and 1 error generated.
make[1]: *** [stablediffusion.o] Error 1
make: *** [sources/go-stable-diffusion/libstablediffusion.a] Error 2
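The fatal error is just a missing header: the ncnn demo code includes <opencv2/opencv.hpp>, which is not on the compiler's search path. One possible direction, an untested assumption on my part rather than something from the LocalAI docs, is to install OpenCV via Homebrew and put its headers on CPATH before rebuilding (include/opencv4 is where the Homebrew formula normally places them):

```shell
# First (if not already installed):  brew install opencv
# Then put the OpenCV headers on the compiler search path before rebuilding.
OPENCV_PREFIX="$(brew --prefix opencv 2>/dev/null || echo /opt/homebrew/opt/opencv)"
export CPATH="${OPENCV_PREFIX}/include/opencv4${CPATH:+:$CPATH}"
echo "$CPATH"
# Then retry:  make GO_TAGS=stablediffusion,tts build
```

If the build still cannot find the header, the backend may also need library paths (LIBRARY_PATH) pointed at the same prefix.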
@nomopo45
If running on Apple Silicon (ARM) it is not suggested to run on Docker due to emulation. Follow the build instructions to use Metal acceleration for full GPU support.
Good point. I don't see this in the doc. Where is it?
I tried searching, but the site search is not reliable: a search like "Apple Silicon" found nothing, even though it is mentioned in the build doc.
I found this:
For GPU Acceleration support for Nvidia video graphic cards, use the Nvidia/CUDA images, if you don’t have a GPU, use the CPU images. If you have AMD or Mac Silicon, see the build section.
from the container doc
and this:
In some cases you might want to re-build LocalAI from source (for instance to leverage Apple Silicon acceleration), or to build a custom container image with your own backends. This section contains instructions on how to build LocalAI from source.
from the build doc
Luckily I opened and read this issue; otherwise, going from Getting Started and skipping the container doc straight to the binary doc, I would never have known that I need to build from source.
On Apple Silicon it is always recommended to use Metal acceleration; otherwise I can see it doesn't fully utilize the GPU. So the Getting Started documentation needs some improvement.