LibreChat Enhancement: Load balancing on Gemini API

Open msg7086 opened this issue 9 months ago • 0 comments

What features would you like to see added?

As shown here https://github.com/danny-avila/LibreChat/blob/94eeec354e5a75a5604f61665cddd3d1afbc76f5/api/app/clients/GoogleClient.js#L24

We only use the us-central1 endpoint, which concentrates all query load on the us-central1 servers and also subjects users to the limit of 1-2 queries per minute per region.

It would be great if you could load balance requests across the endpoints of all regions, to spread the load better and to get around the per-region query limits.

More details

Because of the quota "Generate content requests per minute per project per base model per minute per region per base_model", the number of requests is limited per minute, per region, and per base model, and the limit is usually 1. It gets used up very quickly in a conversation of short messages with Gemini.

Many other regions provide the same capabilities.

['us-west1', 'us-west4', 'us-central1', 'us-south1', 'us-east4', 'northamerica-northeast1', 'europe-central2', 'europe-west1', 'europe-west2', 'europe-west3', 'europe-west4', 'europe-west6', 'asia-east1', 'asia-east2', 'asia-south1', 'asia-northeast1', 'asia-northeast3', 'australia-southeast1'] (may not be a complete list)

We can use all of them and, if possible, let users override which regions to use from the .env file.
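A minimal sketch of what the .env override could look like, assuming a comma-separated variable. The name `GOOGLE_REGIONS` and the helper `parseRegions` are hypothetical and not existing LibreChat settings; the default list is the one given above, which may not be complete.

```javascript
// Default regions, taken from the list above (may not be a complete list).
const DEFAULT_REGIONS = [
  'us-west1', 'us-west4', 'us-central1', 'us-south1', 'us-east4',
  'northamerica-northeast1', 'europe-central2', 'europe-west1',
  'europe-west2', 'europe-west3', 'europe-west4', 'europe-west6',
  'asia-east1', 'asia-east2', 'asia-south1', 'asia-northeast1',
  'asia-northeast3', 'australia-southeast1',
];

/**
 * Parse a comma-separated region list such as "us-west1, europe-west4".
 * Blank entries and duplicates are dropped; if nothing usable remains,
 * a copy of the fallback list is returned.
 */
function parseRegions(raw, fallback = DEFAULT_REGIONS) {
  if (typeof raw !== 'string') {
    return [...fallback];
  }
  const regions = raw
    .split(',')
    .map((region) => region.trim())
    .filter(Boolean);
  const unique = [...new Set(regions)];
  return unique.length > 0 ? unique : [...fallback];
}

// GOOGLE_REGIONS is a hypothetical variable name used for illustration only.
const regions = parseRegions(process.env.GOOGLE_REGIONS);
```

With this, leaving the variable unset keeps every default region, while something like `GOOGLE_REGIONS=us-west1,europe-west4` would restrict requests to those two.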

We can pick regions randomly, or use LRU (least recently used). The goal is to spread query load evenly across all Google regions, giving a much lower chance of hitting the quota limit and getting an error.
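The two strategies could be sketched roughly as follows. `pickRandom` and `LruRegionPicker` are hypothetical names, not part of LibreChat. Because the region set is fixed and every pick counts as a use, least-recently-used selection here behaves like round-robin.

```javascript
/**
 * Random strategy: choose any region with equal probability.
 * The random source is injectable so the function can be tested.
 */
function pickRandom(regions, random = Math.random) {
  return regions[Math.floor(random() * regions.length)];
}

/**
 * LRU strategy: always hand out the region that was used longest ago.
 * The queue is ordered from least to most recently used.
 */
class LruRegionPicker {
  constructor(regions) {
    if (regions.length === 0) {
      throw new Error('at least one region is required');
    }
    this.queue = [...regions];
  }

  next() {
    // Take the least recently used region and mark it as most recent.
    const region = this.queue.shift();
    this.queue.push(region);
    return region;
  }
}

const picker = new LruRegionPicker(['us-west1', 'us-central1', 'europe-west4']);
const firstFour = [picker.next(), picker.next(), picker.next(), picker.next()];
// firstFour cycles through all three regions before reusing 'us-west1'.
```

Random selection needs no state and works across multiple server processes; the LRU picker guarantees an even spread within one process but would need shared state to stay even across several.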

Which components are impacted by your request?

Endpoints

Pictures

No response

Code of Conduct

[X] I agree to follow this project's Code of Conduct

May 14 '24 21:05 msg7086
