[FIX] Unable to connect to local Ollama using self-hosted Docker
Describe the bug
I'm trying to connect to my local ollama server but keep getting connection exceptions (the full tracebacks are too large to fit inside the issue) after these DEBUG messages appear in the server logs:
server-1 | [22:43:54.017709] DEBUG khoj.processor.conversation.openai before_sleep.py:65
server-1 | .utils: Retrying
server-1 | khoj.processor.conversation.openai
server-1 | .utils.completion_with_backoff in
server-1 | 0.6096594730061932 seconds as it
server-1 | raised APIConnectionError:
server-1 | Connection error..
server-1 | [22:43:55.932440] DEBUG khoj.processor.conversation.openai before_sleep.py:65
server-1 | .utils: Retrying
server-1 | khoj.processor.conversation.openai
server-1 | .utils.completion_with_backoff in
server-1 | 0.6813156899799166 seconds as it
server-1 | raised APIConnectionError:
server-1 | Connection error..
server-1 | [22:43:58.039865] DEBUG khoj.routers.helpers: Chat actor: helpers.py:195
server-1 | Infer information sources to refer:
server-1 | 5.843 seconds
I've tried setting the OPENAI_API_BASE variable to the following two values inside the docker-compose file:
- OPENAI_API_BASE=http://host.docker.internal:11434/v1/
- OPENAI_API_BASE=http://localhost:11434/v1/
(and set the same URLs for the AI model API later in the server admin panel).
To Reproduce
Steps to reproduce the behavior:
- Install ollama
- Execute ollama serve, then ollama pull llama3.1
- Launch docker-compose up, and configure the AI model API and Chat Model as stated in the docs
- Choose the model that has been configured, and try chatting
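For reference, the same steps as plain shell commands (a minimal sketch; assumes ollama is already installed and the official docker-compose.yml from the docs is used):
# start the ollama server (or run it in a separate terminal) and pull the model used here
ollama serve &
ollama pull llama3.1
# from the directory containing the official docker-compose.yml, start Khoj
docker-compose up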
Screenshots
API config
Models config
I tried with the names 8b and latest, since the output of ollama list yields
NAME ID SIZE MODIFIED
llama3.1:latest 46e0c10c039e 4.9 GB 2 hours ago
Platform
- Server:
- [ ] Cloud-Hosted (https://app.khoj.dev)
- [X] Self-Hosted Docker
- [ ] Self-Hosted Python package
- [ ] Self-Hosted source code
- Client:
- [ ] Obsidian
- [ ] Emacs
- [ ] Desktop app
- [X] Web browser
- OS:
- [ ] Windows
- [ ] macOS
- [X] Linux
- [ ] Android
- [ ] iOS
If self-hosted
- Server Version: v1.36.0
Having a similar problem. I was initially using Windows, and I reproduced the same thing on Linux using Docker in conjunction with ollama and LM Studio.
Hi, same here. The server container cannot connect to host port 11434, on which Ollama listens. Any ideas?
I thought it might have just been Windows being Windows with WSL, but I get the exact same problem with Linux Mint.
There's got to be something that we're just doing wrong.
@raphaelventura I've struggled for a while but found what wasn't working on my setup (Ubuntu Linux):
Be sure that the AI model API set up in the Admin Web Interface points to http://host.docker.internal:11434/v1/. On my setup Ollama is running inside a Docker container (either bare docker run or docker compose).
So based on your config shown in the screenshots above, prefer using your "ollama" API config and not "ollama-local".
Thanks @infocillasas, but I've tried both URLs (one after the other) in the yaml config file as well as in the model API config page.
My ollama instance isn't running inside a container; it's installed from my distro's package.
This is a very open issue that I have been fighting for days. I can run this fine on my Mac in Docker, but when I install it on an Ubuntu desktop instance with a GPU I see the same issues and error message. I am hosting ollama on the same machine and also trying with Docker (just like OP did).
I can easily connect to my local instance, but have no idea what this "/v1" is:
curl http://192.168.6.241:11434/v1/ returns: 404 page not found
curl http://192.168.6.241:11434 returns: Ollama is running
The /v1/ endpoint is Ollama's OpenAI-compatible API and is the correct base URL to point Khoj at to connect to Ollama.
Same issue here with docker and system ollama installation:
ollama pull llama3.2:latest
NOTICE: ollama run llama3.2 works fine
chat model
- name: llama3.2
- model type: openai
- model api: ollama
- max token: 800
model api
- name: ollama
- api key: placeholder
- base url: http://localhost:11434/v1/
Khoj
v1.36.6
ollama logs
The ollama server logs show nothing, meaning there is no HTTP access. Trying the ollama installation from Open WebUI works fine, with the expected logs in ollama.
I can easily connect to my local instance, but have no idea what this "/v1" is: curl http://192.168.6.241:11434/v1/ 404 page not found curl http://192.168.6.241:11434 Ollama is running
That's how the API is designed: curl http://localhost:11434/v1 gives a 404, while curl http://localhost:11434/v1/models lists the models.
Try with host.docker.internal instead of localhost.
Sadly, this didn't help:
host> curl http://localhost:11434
Ollama is running%
host>
host> docker exec -it khoj-server-1 /bin/bash
root@633fd8c28278:/app# grep internal /etc/hosts
172.17.0.1 host.docker.internal
root@633fd8c28278:/app# curl http://host.docker.internal:11434
curl: (7) Failed to connect to host.docker.internal port 11434 after 0 ms: Connection refused
root@633fd8c28278:/app# curl https://8.8.8.8
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="https://dns.google/">here</A>.
</BODY></HTML>
root@633fd8c28278:/app#
host> docker network ls
NETWORK ID NAME DRIVER SCOPE
a33f5632af16 bridge bridge local
d242b17567d6 host host local
2e266333f8e6 khoj_default bridge local
09a506d21dcf none null local
host> brctl show
bridge name bridge id STP enabled interfaces
br-2e266333f8e6 8000.02424dc296a0 no veth05c3791
veth7f1269e
vetha75db8c
vethe6550e3
docker0 8000.0242f0873c9b no
host> ip a
<...>
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
link/ether 02:42:f0:87:3c:9b brd ff:ff:ff:ff:ff:ff
inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
valid_lft forever preferred_lft forever
6: br-2e266333f8e6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
link/ether 02:42:4d:c2:96:a0 brd ff:ff:ff:ff:ff:ff
inet 172.18.0.1/16 brd 172.18.255.255 scope global br-2e266333f8e6
valid_lft forever preferred_lft forever
<...>
There is still no access in the ollama logs...
BTW, I've tried setting OLLAMA_ORIGINS in the systemd service (restarted, so the option is applied):
[Service]
Environment="OLLAMA_ORIGINS=http://172.18.0.2:*,http://172.18.0.3:*,http://172.18.0.4:*,http://172.18.0.5:*,http://172.18.0.2,http://172.18.0.3,http://172.18.0.4,http://172.18.0.5"
I still get "Connection refused" in the docker instance and no access in the logs of ollama.
It's now working here. In short, the ollama service configuration and the Docker network configuration did not match.
Detailed explanation (on Linux)
The docker config
The docker instance must be able to reach the ollama service running directly on the host.
The docker installation of khoj with compose uses docker's bridge mode via the "docker0" bridge. The docker instance's hosts file has the host.docker.internal hostname pointing to 172.17.0.1 (the IP of the bridge gateway).
The ollama config
Because of the bridge mode on Linux, the ollama service MUST be configured to listen on the bridge address 172.17.0.1:11434 (or on all interfaces). If not, the khoj app won't be able to reach the ollama service on the host.
You are expected to use host.docker.internal in your model API base URL: http://host.docker.internal:11434/v1
Possible solutions
a) ollama listening on 172.17.0.1
For this to work, configure your service to listen on the correct address. For systemd, use systemctl edit ollama.service and add these lines to the config file (in the correct section):
### Anything between here and the comment below will become the contents of the drop-in file
[Service]
Environment="OLLAMA_HOST=172.17.0.1:11434"
Restart ollama:
systemctl daemon-reload
systemctl restart ollama.service
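A quick way to verify the binding took effect (the container name khoj-server-1 matches the compose setup shown earlier in this thread):
# on the host: ollama should now be listening on the bridge gateway address
ss -tuln | grep 11434
# from inside the khoj container: the bridge address should answer
docker exec -it khoj-server-1 curl http://host.docker.internal:11434
# expected output: "Ollama is running"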
Downsides:
- any other client must be reconfigured to use this IP address
- the docker0 must be up before ollama gets started so it can bind on this address
b) ollama listening on all interfaces
Configure ollama to listen on 0.0.0.0:11434 by following the same steps as above (see the drop-in sketch after this list).
Downsides:
- the docker0 must be up before ollama gets started so it can bind on the bridge address
- ollama is now listening on ALL interfaces, so it can likely be reached from your networks. This is insecure, and additional firewall configuration is strongly recommended.
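The corresponding drop-in for this option (same systemctl edit ollama.service flow as in option a):
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
followed by systemctl daemon-reload and systemctl restart ollama.service.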
c) Redirect the traffic
The idea is to forward traffic arriving on docker0 port 11434 to the ollama service on localhost:11434.
In other words, ollama keeps listening on localhost:11434.
With socat:
sudo socat TCP4-LISTEN:11434,bind=172.17.0.1,fork TCP:127.0.0.1:11434
NOTICE: something similar should be possible with nat PREROUTING and POSTROUTING iptables rules (see the sketch after this list).
Downsides:
- additional configuration is required for the full setup to work.
- the docker0 bridge must be up
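A possible iptables variant of the same redirect (a sketch only, not tested here; route_localnet is needed because the kernel otherwise refuses to DNAT traffic from another interface to 127.0.0.1):
# allow traffic arriving on docker0 to be redirected to the loopback address
sudo sysctl -w net.ipv4.conf.docker0.route_localnet=1
# redirect container traffic aimed at 172.17.0.1:11434 to ollama on 127.0.0.1:11434
sudo iptables -t nat -A PREROUTING -i docker0 -p tcp -d 172.17.0.1 --dport 11434 -j DNAT --to-destination 127.0.0.1:11434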
Testing
The docker instance must be able to reach the ollama service like this:
host> docker exec -it khoj-server-1 /bin/bash
root@aa2ea544712e:/app# curl http://host.docker.internal:11434
Ollama is running
root@aa2ea544712e:/app#
Notice the "Ollama is running".
Hope this helps.
Thanks very much for this detailed explanation.
I modified my ollama service with a new OLLAMA_HOST address and an ExecStartPre instruction in order to wait for the docker bridge to be up, as advised.
Unfortunately, I still have a connection error. It may be trivial to troubleshoot but I'm quite unfamiliar with network stuff. I tried to ping / curl ollama's URL using
docker-compose start server
docker-compose exec -it server /bin/bash
root@df6c8002d319:/app# curl http://host.docker.internal:11434
curl: (6) Could not resolve host: host.docker.internal
from the directory containing the official docker-compose.yml file.
From within the container, I can see that
root@df6c8002d319:/app# cat /etc/hosts
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
172.18.0.3 df6c8002d319
Seems like I'm missing some alias for host.docker.internal. How should I handle this?
I guess you didn't start all the docker stuff with docker-compose start server. Could you try with docker-compose up instead?
What's the output of docker network ls, docker network inspect bridge, and brctl show?
Here are the 3 outputs in order after launching docker-compose up
❯ sudo docker network ls
NETWORK ID NAME DRIVER SCOPE
52d79ccd92a1 bridge bridge local
c84193da4b2a host host local
f57e0698514f khoj_default bridge local
07593506cb45 none null local
❯ sudo docker network inspect bridge
[
{
"Name": "bridge",
"Id": "52d79ccd92a118d389c4bb5c3adcc8fdeb5daed79a53226bdebb3f2a1ed7944e",
"Created": "2025-02-19T15:58:41.723093059+01:00",
"Scope": "local",
"Driver": "bridge",
"EnableIPv6": false,
"IPAM": {
"Driver": "default",
"Options": null,
"Config": [
{
"Subnet": "172.17.0.0/16",
"Gateway": "172.17.0.1"
}
]
},
"Internal": false,
"Attachable": false,
"Ingress": false,
"ConfigFrom": {
"Network": ""
},
"ConfigOnly": false,
"Containers": {},
"Options": {
"com.docker.network.bridge.default_bridge": "true",
"com.docker.network.bridge.enable_icc": "true",
"com.docker.network.bridge.enable_ip_masquerade": "true",
"com.docker.network.bridge.host_binding_ipv4": "0.0.0.0",
"com.docker.network.bridge.name": "docker0",
"com.docker.network.driver.mtu": "1500"
},
"Labels": {}
}
]
❯ brctl show
bridge name bridge id STP enabled interfaces
br-f57e0698514f 8000.02424e01dcd6 no veth0d73803
veth2b4770c
veth2da0e3f
veth6915c9d
docker0 8000.0242f5ad693f no
The outputs look good, and the docker0 bridge with the correct IPs and bindings is there. Do you have the host.docker.internal line in /etc/hosts with docker-compose up?
No, I didn't!
I tried to add
extra_hosts:
- "host.docker.internal:host-gateway"
inside the docker-compose config, and then a corresponding line appeared inside /etc/hosts and a connection was established between my running ollama service and khoj. But now I'm getting (probably unrelated) timeout errors like these:
server-1 | [20:21:23.900044] DEBUG khoj.processor.conversation.openai before_sleep.py:65
server-1 | .utils: Retrying
server-1 | khoj.processor.conversation.openai
server-1 | .utils.completion_with_backoff in
server-1 | 0.3248539937731624 seconds as it
server-1 | raised APITimeoutError: Request
server-1 | timed out..
server-1 | [20:22:25.516722] DEBUG khoj.processor.conversation.openai before_sleep.py:65
server-1 | .utils: Retrying
server-1 | khoj.processor.conversation.openai
server-1 | .utils.completion_with_backoff in
server-1 | 0.49318227139203263 seconds as it
server-1 | raised APITimeoutError: Request
server-1 | timed out..
server-1 | [20:23:27.470012] DEBUG khoj.routers.api: Extracting search helpers.py:195
server-1 | queries took: 184.879 seconds
server-1 | [20:23:27.471631] ERROR khoj.routers.api_chat: Error api_chat.py:1005
server-1 | searching knowledge base: Request
server-1 | timed out.. Attempting to respond
server-1 | without document references.
I'll leave this issue open for now, since we had to go through some configuration that isn't documented; that may be of interest for contributors to look into.
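For anyone landing here, this is roughly where that entry goes in the official docker-compose.yml (a sketch showing only the relevant keys of the server service, everything else unchanged):
services:
  server:
    # ... image, ports, volumes, etc. as in the official compose file
    extra_hosts:
      - "host.docker.internal:host-gateway"
    environment:
      # point Khoj's OpenAI-compatible client at ollama running on the host
      - OPENAI_API_BASE=http://host.docker.internal:11434/v1/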
Using Linux.
Tried a) and b) and unfortunately am still getting:
APIConnectionError: Connection error.
Pretty sure my config is right:
- OPENAI_BASE_URL=http://host.docker.internal:11434/v1/
I'm performing the following.
docker compose up
ollama serve
ollama run deepseek-r1:32b
After editing said file above and running:
➜ .khoj systemctl daemon-reload
systemctl restart ollama.service
If the ollama service is running, you do not need to run ollama serve (check systemctl status ollama.service) and ollama run afterwards. Khoj will call what it needs to.
Maybe run the checks suggested above?
I faced exactly the same issues described here.
Environment:
- Ubuntu 24.04.2
- NVIDIA RTX 4070 TI Super
- NVIDIA-SMI 560.35.03 Driver Version: 560.35.03
- CUDA Version: 12.6
- Khoj v1.41.0
- Khoj setup: docker-compose.yaml + ollama running as system service
I did something similar to @Francommit and do not get connection errors anymore.
Steps:
- Check where ollama is listening; it turns out it only accepts requests from localhost:
❯ ss -tuln | grep 11434
tcp LISTEN 0 4096 127.0.0.1:11434 0.0.0.0:*
... which is the reason for:
❯ docker exec -it khoj-server-1 bash
root@8fe6c6cb95f2:/app# curl http://localhost:11434/v1/models
curl: (7) Failed to connect to localhost port 11434 after 0 ms: Connection refused
- ollama needs to accept requests from more than localhost, for simplicity from everywhere:
❯ sudo systemctl edit ollama.service
[sudo] password for ...:
... enter:
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
... and save: Successfully installed edited file '/etc/systemd/system/ollama.service.d/override.conf'.
- Restart the service and check that it accepts more than localhost:
❯ sudo systemctl restart ollama
❯ ss -tuln | grep 11434
tcp LISTEN 0 4096 *:11434 *:*
- Check that it is now available from within the khoj container:
❯ docker exec -it khoj-server-1 bash
root@8fe6c6cb95f2:/app# curl http://host.docker.internal:11434/v1/models
{"object":"list","data":[{"id":"llama3.1:8b-instruct-fp16","object":"model","created":1745992192,"owned_by":"library"},{"id":"llama3.1:8b","object":"model","created":1745989544,"owned_by":"library"}]}
So the issue is solved with that.
Apart from that, the ollama integration doesn't seem to respect my local models.
I have the following models downloaded:
❯ ollama list
NAME ID SIZE MODIFIED
llama3.1:8b-instruct-fp16 4aacac419454 16 GB 10 hours ago
llama3.1:8b 46e0c10c039e 4.9 GB 11 hours ago
I successfully integrated ollama into Khoj, but can't use these models directly.
- After configuring the "Ai model apis": Name: ollama; API key: None; API base URL: http://host.docker.internal:11434/v1
- Adding the desired model to "Chat models": Name llama3.1:8b; Model type: offline; Ai model api: ollama
- ... and setting the model as default model for all cases in http://localhost:42110/server/admin/database/serverchatsettings/
- Opening a new chat (even after restarting the container) and entering a message results in an error from /app/src/khoj/processor/conversation/offline/utils.py:63 (load_model_from_cache), which seems to expect a model name that can be split by "/" (and not the name that ollama uses). So I used hugging-face-like model names such as "meta-llama/Llama-3.1-8B".
- But no success: when doing the same thing and setting meta-llama/Llama-3.1-8B as default for all chat cases, we get (during /app/src/khoj/processor/conversation/offline/utils.py:58):
server-1 | ValueError: No file found in
server-1 | meta-llama/Llama-3.1-8B that match
server-1 | *Q4_K_M.gguf
server-1 |
server-1 | Available Files:
server-1 | ["original", ".gitattributes",
server-1 | "LICENSE", "README.md",
server-1 | "USE_POLICY.md", "config.json",
server-1 | "generation_config.json",
server-1 | "model-00001-of-00004.safetensors",
server-1 | "model-00002-of-00004.safetensors",
server-1 | "model-00003-of-00004.safetensors",
server-1 | "model-00004-of-00004.safetensors",
server-1 | "model.safetensors.index.json",
server-1 | "special_tokens_map.json",
server-1 | "tokenizer.json",
server-1 | "tokenizer_config.json",
server-1 | "original/consolidated.00.pth",
server-1 | "original/params.json",
server-1 | "original/tokenizer.model"]
If I'm not missing something (please let me know), I can create another issue for that.
1. After configuring the "Ai model apis" (http://localhost:42110/server/admin/database/aimodelapi/): Name: ollama; API key: None; API base URL: http://host.docker.internal:11434/v1
2. Adding the desired model to "Chat models" (http://localhost:42110/server/admin/database/chatmodel/): Name: llama3.1:8b; Model type: offline; Ai model api: ollama
@btschwertfeger
I think you should not choose the offline Model type, but rather OpenAI since you're using the Ollama API which is OpenAI-compliant.
Oh yes, that was the trick - Thank you!
None of the above solutions worked for me. I am using the RAGFlow docker setup, not khoj, but the problem is similar: from within the docker container I am unable to access 172.17.0.1:11434, with ollama installed directly on WSL Ubuntu 24.04 (i.e. not as a docker service). This is what I was getting earlier:
`ashok@ashok:~$ ollama serve
time=2025-09-02T13:35:00.313Z level=INFO source=routes.go:1331 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/home/ashok/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NEW_ESTIMATES:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"`
Note the entry OLLAMA_HOST:http://127.0.0.1:11434 in the server config.
This is what I did: exported the OLLAMA_HOST variable, as:
ashok@ashok:~$ export OLLAMA_HOST="http://0.0.0.0:11434"
And then started Ollama:
`ashok@ashok:~$ ollama serve
time=2025-09-02T13:38:50.176Z level=INFO source=routes.go:1331 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://0.0.0.0:11434 OLLAMA_INTEL_GPU:false OLLAMA_KEEP_ALIVE:5m0s OLLAMA_KV_CACHE_TYPE: OLLAMA_LLM_LIBRARY: OLLAMA_LOAD_TIMEOUT:5m0s OLLAMA_MAX_LOADED_MODELS:0 OLLAMA_MAX_QUEUE:512 OLLAMA_MODELS:/home/ashok/.ollama/models OLLAMA_MULTIUSER_CACHE:false OLLAMA_NEW_ENGINE:false OLLAMA_NEW_ESTIMATES:false OLLAMA_NOHISTORY:false OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:* app://* file://* tauri://* vscode-webview://* vscode-file://*] OLLAMA_SCHED_SPREAD:false ROCR_VISIBLE_DEVICES: http_proxy: https_proxy: no_proxy:]"`
Now, notice: OLLAMA_HOST:http://0.0.0.0:11434
This worked.
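Two quick checks that the new binding is actually in effect (<container> below is a placeholder for whatever your compose project names the container):
# on the host / WSL distro: ollama should now be listening on all interfaces
ss -tuln | grep 11434
# from inside the container: the docker bridge gateway should now answer
docker exec -it <container> curl http://172.17.0.1:11434
# expected output: "Ollama is running"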
Closing now since this is a pretty old thread and things may have changed by now.
The most relevant answer was @nicolas33's: https://github.com/khoj-ai/khoj/issues/1100#issuecomment-2668093077