Changing of grpc parameters
Is your feature request related to a problem? Please describe.
Together with a colleague, I tried to get an embedding model running: Llama embeddings as described here: https://localai.io/features/embeddings/
We were getting the following error: "rpc error: code = ResourceExhausted desc = grpc: received message larger than max (40000001 vs. 4194304)"
After doing some research, I think this error is related to the gRPC server/client default parameters, which, as far as I can tell, cannot currently be changed from outside and might be too small for larger embeddings.
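To make the numbers in the error concrete, here is a minimal sketch of the size check. The second number, 4194304, is 4 MiB, which is grpc-go's default maximum receive message size. The helper exceedsLimit is a hypothetical name used only for illustration and is not part of grpc-go or LocalAI.

```go
package main

import "fmt"

// defaultMaxRecvSize is grpc-go's default maximum receive message size: 4 MiB.
const defaultMaxRecvSize = 4 * 1024 * 1024 // 4194304 bytes

// exceedsLimit is a hypothetical helper: it reports whether a message of
// size bytes is larger than a receiver's limit of limit bytes.
func exceedsLimit(size, limit int) bool {
	return size > limit
}

func main() {
	const msgSize = 40000001 // message size reported in the error

	fmt.Println(defaultMaxRecvSize)                        // 4194304
	fmt.Println(exceedsLimit(msgSize, defaultMaxRecvSize)) // true
	fmt.Printf("%.1fx over the limit\n", float64(msgSize)/defaultMaxRecvSize)
}
```

So the message is roughly 9.5 times larger than the default allows, and any new limit would have to be at least 40000001 bytes for this model.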
Describe the solution you'd like
I would like to have the possibility to change grpc parameters when the server/clients are initialised.
I think this could be done by extending s := grpc.NewServer() in line 175 of https://github.com/go-skynet/LocalAI/blob/master/pkg/grpc/server.go and likewise for the clients.
Describe alternatives you've considered
Change the default message sizes used in the gRPC framework to larger values or to infinity.
The Llama model whose embeddings we tried to use is: llama-2-7b-chat.ggmlv3.q4_K_M.bin
:warning::warning::warning::warning::warning:
Hi! I'm a bot running with LocalAI ( a crazy experiment of @mudler ) - please beware that I might hallucinate sometimes!
but.... I can also be funny or helpful :smile_cat: and I can provide generally speaking good tips or places where to look after in the documentation or in the code based on what you wrote in the issue.
Don't engage in conversation with me, I don't support (yet) replying!
:warning::warning::warning::warning::warning:
I understand that you are trying to use the Llama embeddings feature of LocalAI, but you are encountering an error related to the gRPC server/client default parameters. You would like to be able to change these parameters when initializing the server and clients.
To address this issue, you can modify the grpc.NewServer() function in the server.go file of the LocalAI codebase. This function is responsible for creating a new grpc Server object with default values for various parameters such as the maximum message size. You can modify these defaults by passing in custom options to the grpc.NewServer() function.
Here's an example of how you could modify the grpc.NewServer() function to allow for larger message sizes:
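func NewServer(o *options.Option) (*grpc.Server, error) {
	// 100000000 bytes is roughly 100 MB, above the 40000001 bytes in the error
	const maxMsgSize = 100000000
	s := grpc.NewServer(
		grpc.MaxRecvMsgSize(maxMsgSize), // limit for messages the server receives
		grpc.MaxSendMsgSize(maxMsgSize), // limit for messages the server sends
	)
	return s, nil
}
This modified function raises the server's maximum receive and send message sizes to 100000000 bytes (about 100 MB), which covers the 40000001-byte message from the error. You can adjust this value as needed based on your specific requirements. The options.Option type stands in for whatever configuration type you use; it is a placeholder, not an existing LocalAI package.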
To use this modified function, your main program would create the server with NewServer() and then serve it on a listener. Here's an example of how that could look:
func main() {
	o := options.New()
	s, err := NewServer(o)
	if err != nil {
		panic(err)
	}
	// Register the backend service on s here (omitted in this sketch).
	// *addr is assumed to be an address flag defined elsewhere.
	lis, err := net.Listen("tcp", *addr)
	if err != nil {
		panic(err)
	}
	if err := s.Serve(lis); err != nil {
		panic(err)
	}
}
This code creates an options.Option object with any desired customizations, then calls NewServer() to create a grpc Server object with the larger message limits. Finally, it opens a TCP listener on the given address and starts serving with s.Serve().
I hope this helps you resolve your issue with the Llama embeddings feature of LocalAI! Let me know if you have any further questions or concerns.
Sources:
- https://github.com/go-skynet/LocalAI/tree/master/api/backend/embeddings.go
- https://github.com/go-skynet/LocalAI/tree/master/cmd/grpc/bert-embeddings/main.go
- https://github.com/go-skynet/LocalAI/tree/master/pkg/grpc/proto/backend_grpc.pb.go
- https://github.com/go-skynet/LocalAI/tree/master/cmd/grpc/rwkv/main.go
This gives me some idea of how to raise the maximum receive message size. For example, setting a limit of 100000000 bytes where the server is created, in line 176 of https://github.com/go-skynet/LocalAI/blob/master/pkg/grpc/server.go, could be an option.
I don't know how settings for gRPC are handled in general, but it would be nice to have some changeable options once the package is installed.
I am also not quite sure whether raising the limit in NewServer alone is correct. I would expect that the clients need to be adapted as well.
This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.