LocalAI rpc error: code = ResourceExhausted desc = grpc: received message larger than max (400000002 vs. 4194304)

LocalAI version:

quay.io/go-skynet/local-ai:latest

Environment, CPU architecture, OS, and Version:

Linux localhost.localdomain 3.10.0-1160.99.1.el7.x86_64 #1 SMP Wed Sep 13 14:19:20 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Describe the bug

rpc error: code = ResourceExhausted desc = grpc: received message larger than max (400000002 vs. 4194304)
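For scale: the two numbers in the message are byte counts. 4194304 is exactly 4 MiB, which is gRPC's default limit for a received message, and 400000002 is roughly 381 MiB. A small sketch for pulling the two values out of a log line (the helper name `parse_grpc_size_error` is made up for illustration and is not part of LocalAI):

```python
import re

MIB = 1024 * 1024


def parse_grpc_size_error(message):
    """Return (received_bytes, limit_bytes) from a gRPC 'larger than max' error, or None."""
    match = re.search(r"larger than max \((\d+) vs\. (\d+)\)", message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


msg = (
    "rpc error: code = ResourceExhausted desc = grpc: received message "
    "larger than max (400000002 vs. 4194304)"
)
received, limit = parse_grpc_size_error(msg)
print(f"received {received / MIB:.1f} MiB, limit {limit / MIB:.1f} MiB")
# prints: received 381.5 MiB, limit 4.0 MiB
```

The same pattern also matches the `SERVER: Received message larger than max (...)` variant that appears later in this thread.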

I am using https://github.com/ymcui/Chinese-LLaMA-Alpaca. I downloaded the Chinese Alpaca-13B model, converted it to a ggml model file (ggml-model-q4_0.gguf), and added a llama.yaml file in the models directory, with the following content:

I started it in DEBUG mode

Using Postman, I sent the POST request shown in the following figure

The log output of LocalAI is as follows

What should I do, please?

Oct 08 '23 09:10 yourtiger

pkg/grpc/Client.go pkg/grpc/Server.go

Oct 12 '23 07:10 yourtiger

I'm experiencing the same issue: trying to use embedding with a llama model and llama backend. I'd be keen to hear about any possible solutions.

Oct 15 '23 09:10 alexmccartneymoore

:warning::warning::warning::warning::warning:

Hi! I'm a bot running with LocalAI ( a crazy experiment of @mudler ) - please beware that I might hallucinate sometimes!

But... I can also be funny or helpful :smile_cat:, and I can generally give good tips or point to places to look in the documentation or the code, based on what you wrote in the issue.

Don't engage in conversation with me, I don't support (yet) replying!

:warning::warning::warning::warning::warning:

The error you are experiencing is likely due to the size of the input data being sent to LocalAI. The error message indicates that the received message was larger than the maximum allowed size. This could be caused by several factors such as the size of the input data or the number of iterations in the model. To troubleshoot this issue, you can try reducing the size of the input data and/or decreasing the number of iterations in the model. Additionally, you may want to check if your hardware has enough resources to handle the computation required by the model. If these steps do not resolve the issue, please provide more information about the specifics of the model you are using and the environment you are running LocalAI on. This will help us better understand the issue and provide further assistance.
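One way to act on the "reduce the size of the input data" suggestion is to split a large list of inputs into several smaller requests. The sketch below is only an illustration: the helper `batch_by_bytes` and the byte budget are hypothetical, and the budget is measured on the JSON-encoded strings, which only approximates the size of the gRPC message LocalAI forwards to the backend. It also will not help if the oversized message is the response rather than the request, which the error text alone does not settle.

```python
import json


def batch_by_bytes(texts, max_bytes):
    """Group texts into batches whose JSON-encoded size stays within max_bytes.

    A single text larger than max_bytes still gets a batch of its own.
    """
    batches, current, current_size = [], [], 0
    for text in texts:
        # +1 accounts for the comma separating items in a JSON array.
        size = len(json.dumps(text).encode("utf-8")) + 1
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(text)
        current_size += size
    if current:
        batches.append(current)
    return batches


# Each item costs 7 bytes here (4 characters, 2 quotes, 1 separator).
print(batch_by_bytes(["aaaa", "bbbb", "cccc"], max_bytes=16))
# prints: [['aaaa', 'bbbb'], ['cccc']]
```

Each batch would then be sent as its own request, with a budget chosen comfortably below the 4194304-byte limit from the error.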

Sources:

https://github.com/go-skynet/LocalAI/tree/master/cmd/grpc/stablediffusion/main.go
https://github.com/go-skynet/LocalAI/tree/master/CONTRIBUTING.md
https://github.com/go-skynet/LocalAI/tree/master/.github/ISSUE_TEMPLATE/bug_report.md

Oct 15 '23 13:10 localai-bot

I experienced the same issue: https://github.com/go-skynet/LocalAI/issues/1114

Oct 17 '23 07:10 dr-baem

yourtiger seems to have found a valid solution. Can someone propose a permanent Pull/Merge Request to fix the issue?

Oct 17 '23 07:10 dr-baem

Ran into this issue today when sending a lot of images

[img-5][img-4][img-3][img-2][img-1][img-0]What are in these images? Is there any difference between them?<|eot_id|>

11:32AM DBG Prompt (before templating): <|start_header_id|>user<|end_header_id|>

[img-5][img-4][img-3][img-2][img-1][img-0]What are in these images? Is there any difference between them?<|eot_id|>

11:32AM DBG Template found, input modified to: <|start_header_id|>user<|end_header_id|>

[img-5][img-4][img-3][img-2][img-1][img-0]What are in these images? Is there any difference between them?<|eot_id|>

<|start_header_id|>assistant<|end_header_id|>

11:32AM DBG Prompt (after templating): <|start_header_id|>user<|end_header_id|>

[img-5][img-4][img-3][img-2][img-1][img-0]What are in these images? Is there any difference between them?<|eot_id|>

<|start_header_id|>assistant<|end_header_id|>

11:32AM DBG Model already loaded in memory: llava-llama-3-8b-v1_1-int4.gguf

11:32AM DBG Model 'llava-llama-3-8b-v1_1-int4.gguf' already loaded

11:32AM ERR Server error error="rpc error: code = ResourceExhausted desc = SERVER: Received message larger than max (8237233 vs. 4194304)" ip=127.0.0.1 latency=351.209892ms method=POST status=500 url=/v1/chat/completions
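The 8237233 bytes in the error are plausible if the six images travel base64-encoded inside the gRPC message. That is an assumption about how the request reaches the backend, not something the log confirms. Base64 turns every 3 raw bytes into 4 characters, so a rough pre-flight check could look like this (the names `base64_len` and `fits` are made up for illustration):

```python
import base64

GRPC_DEFAULT_MAX = 4 * 1024 * 1024  # 4194304, the limit shown in the log


def base64_len(raw_len):
    """Length in bytes of the padded base64 encoding of raw_len bytes."""
    return 4 * ((raw_len + 2) // 3)


def fits(raw_sizes, limit=GRPC_DEFAULT_MAX):
    """Return (total_encoded_bytes, within_limit) for a list of raw image sizes."""
    total = sum(base64_len(n) for n in raw_sizes)
    return total, total <= limit


# Sanity check of the formula against the standard library encoder.
assert base64_len(1000) == len(base64.b64encode(b"x" * 1000))

# Six images of 1 MiB each, ignoring the prompt text and message overhead.
total, ok = fits([1024 * 1024] * 6)
print(total, ok)
# prints: 8388624 False
```

Under that assumption, six images of about 1 MiB each already come to roughly twice the 4 MiB limit, so downscaling them or sending fewer per request would be the client-side workaround.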

Sep 06 '24 11:09 sfxworks

Yep, just ran into it with v2.24.1, using vllm with Isotr0py/Phi-3.5-vision-instruct-AWQ and an image that seems to be too big :(

Dec 11 '24 17:12 Nold360

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.

Nov 21 '25 02:11 github-actions[bot]

This issue was closed because it has been stalled for 5 days with no activity.

Nov 26 '25 02:11 github-actions[bot]