private-gpt
Ingestion rate limiter in Ollama?
Is there an ingestion rate limiter setting in Ollama or in PrivateGPT?
Ingestion of any document is limited to 2.07 s/it when generating embeddings, which corresponds to a load of only 0-3% on a 4090 :(
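To put that rate in perspective, here is a rough back-of-the-envelope sketch. The chunk counts below are made-up assumptions for illustration, not measurements from my setup:

```python
# Illustrative arithmetic only: what 2.07 s/it means for ingestion time.
seconds_per_iteration = 2.07   # observed embedding time per iteration
chunks_per_iteration = 1       # assumption: one chunk embedded per iteration
chunks_per_document = 200      # hypothetical document split into 200 chunks

total_seconds = chunks_per_document / chunks_per_iteration * seconds_per_iteration
print(f"{total_seconds / 60:.1f} min per document")
```

At that pace a modest document takes minutes, while the GPU sits nearly idle.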
Running vanilla Ollama:
```yaml
llm_model: mistral
embedding_model: nomic-embed-text
```
with
```yaml
embedding:
  mode: ollama
  ingest_mode: pipeline
  count_workers: 32
```
Any chance to set the beast free, other than loading a large number of docs at the same time?
/Peter