Javier Martinez
In the latest Ollama release (v0.2.0), you can configure how many concurrent requests and parallel model loads you want. Please check this [article](https://www.linkedin.com/pulse/enhanced-concurrency-features-ollamas-latest-update-robyn-le-sueur-bh7rf) :)
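As a minimal sketch, assuming the standard Ollama concurrency environment variables (`OLLAMA_NUM_PARALLEL`, `OLLAMA_MAX_LOADED_MODELS`, `OLLAMA_MAX_QUEUE`) are available in your build, the configuration would look something like:

```bash
# Concurrency is configured on the Ollama server via environment variables,
# set before starting the server.
export OLLAMA_NUM_PARALLEL=4        # concurrent requests per loaded model
export OLLAMA_MAX_LOADED_MODELS=2   # models kept loaded at the same time
export OLLAMA_MAX_QUEUE=512         # requests queued before new ones are rejected
ollama serve
```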
Fixed in https://github.com/zylon-ai/private-gpt/pull/1954
@neofob could you open a PR with these changes? It sounds really useful :)
Have you checked whether it is a virtualenv error? I use `venv` and I don't have any problems. Also, looking at your case, remember that when you install a new extra...
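For reference, a minimal sketch of a clean reinstall, assuming a Poetry-based setup; the extras names below are only examples and should match your configuration:

```bash
# Recreate a clean virtual environment, then reinstall with your extras.
python3 -m venv .venv
source .venv/bin/activate
# Example extras; replace with the ones matching your setup.
poetry install --extras "ui llms-ollama embeddings-ollama"
```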
Is this still an issue?
You'll need an `HF_TOKEN` in order to download the `mistral` tokenizer. Please check: https://huggingface.co/docs/hub/en/models-gated Let's try to find a solution to avoid having to download something when it is not necessary...
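As a sketch, assuming you have already requested access to the gated repo on Hugging Face, authenticating for the download would look something like:

```bash
# Option 1: interactive login (stores the token locally)
huggingface-cli login
# Option 2: export the token so downloads can pick it up
export HF_TOKEN=<your_hugging_face_access_token>
```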
Right now, we cannot modify the embedding context using Ollama. You will have to configure it by modifying `embedding_component`, just as this PR modified `llm_component`: https://github.com/zylon-ai/private-gpt/pull/1703/files#diff-d1cc2631298b50677e869ca40d96be3d748f912661d694b916f3a99b5827fdf9R118
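To illustrate what such a setting would do at the Ollama level: the Ollama embeddings endpoint itself accepts per-request `options` such as `num_ctx` (the model name below is just an example):

```bash
# Per-request options on Ollama's embeddings endpoint, including num_ctx.
curl http://localhost:11434/api/embeddings -d '{
  "model": "nomic-embed-text",
  "prompt": "text to embed",
  "options": { "num_ctx": 4096 }
}'
```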
Fixed in #1792
Remember that if you decide to use another LLM in Ollama, you have to pull it first: `ollama pull llama3`. After downloading, make sure that Ollama is working as expected....
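For example, a quick smoke test after pulling (the model name is just an example):

```bash
ollama pull llama3
ollama list                    # the model should show up here
ollama run llama3 "Say hello"  # quick one-shot sanity check
```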
As @dpedwards said, you can't use Metal with Docker. The easiest and simplest way to run it on a Mac is `ollama` running on top of the host and `PrivateGPT` using any...
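A minimal sketch of that setup, assuming the `ollama` settings profile from the PrivateGPT docs and Docker Desktop's `host.docker.internal` alias:

```bash
# On the Mac host, where Metal acceleration works natively:
ollama serve

# Run PrivateGPT with its Ollama profile, pointing at the host server:
PGPT_PROFILES=ollama make run
# If PrivateGPT itself runs inside Docker, use
# http://host.docker.internal:11434 as the Ollama api_base instead.
```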