private-gpt
Ingestion rate limiter in Ollama?
Is there an ingestion rate limiter setting in Ollama or in PrivateGPT?
Ingestion of any document is limited to 2.07 s/it when generating embeddings, which corresponds to a load of only 0-3% on a 4090 :(
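To put that rate in perspective, here is a rough back-of-the-envelope sketch. The chunk counts below are made-up assumptions for illustration, not measurements from my setup:

```python
# Illustrative arithmetic only: what 2.07 s/it means for ingestion time.
seconds_per_iteration = 2.07   # observed embedding time per iteration
chunks_per_iteration = 1       # assumption: one chunk embedded per iteration
chunks_per_document = 200      # hypothetical document split into 200 chunks

total_seconds = chunks_per_document / chunks_per_iteration * seconds_per_iteration
print(f"{total_seconds / 60:.1f} min per document")
```

At that pace a modest document takes minutes, while the GPU sits nearly idle.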
Running vanilla Ollama:
```yaml
llm_model: mistral
embedding_model: nomic-embed-text
```
with
```yaml
embedding:
  mode: ollama
  ingest_mode: pipeline
  count_workers: 32
```
Any chance to set the beast free, other than loading a large number of docs at the same time?
/Peter