[BUG] - Ollama OpenAI not working
Description
When using the OpenAI LLM interface directly, everything works except chat over the GraphRAG Collection, which is not available.
This is the error:
/home/master/.local/lib/python3.10/site-packages/datashaper/engine/verbs/convert.py:72: UserWarning: Could not infer format, so each element will be parsed individually, falling back to dateutil. To ensure parsing is consisten
File "/home/master/.local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
result = context.run(func, *args)
File "/home/master/.local/lib/python3.10/site-packages/gradio/utils.py", line 818, in wrapper
response = f(*args, **kwargs)
File "/mnt/c/workplace/Pycharm-workplcae/kotaemon/libs/ktem/ktem/pages/chat/init.py", line 704, in message_selected
return retrieval_history[index], plot_history[index]
IndexError: list index out of range
But when deploying the local models qwen2 and nomic-embed-text with Ollama, this error occurs:
openai.APIConnectionError: Connection error.
Reproduction steps
1. The local model can be called successfully through the OpenAI-compatible interface.
2. The .env file has been configured for Ollama:
'''
GRAPHRAG_API_KEY="ollama"
GRAPHRAG_API_BASE="http://localhost:11434/v1/"
GRAPHRAG_LLM_API_KEY="ollama"
GRAPHRAG_LLM_API_BASE="http://localhost:11434/v1/"
GRAPHRAG_LLM_MODEL="qwen2"
GRAPHRAG_EMBEDDING_API_KEY="ollama"
GRAPHRAG_EMBEDDING_API_BASE="http://localhost:11434/v1/"
GRAPHRAG_EMBEDDING_MODEL="nomic-embed-text"
OPENAI_API_BASE="http://localhost:11434/v1/"
OPENAI_API_KEY="ollama"
OPENAI_CHAT_MODEL="qwen2"
OPENAI_EMBEDDINGS_MODEL="nomic-embed-text"
'''
3. ./ktem_app_data/user_data/files/graphrag/xxxx/settings.yaml has been updated to use GRAPHRAG_LLM_MODEL and GRAPHRAG_EMBEDDING_MODEL by modifying the code:
'''
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat
  model: qwen2
  model_supports_json: true
  api_base: http://localhost:11434/v1/
parallelization:
  stagger: 0.3
async_mode: threaded
embeddings:
  async_mode: threaded
  llm:
    api_key: ${GRAPHRAG_API_KEY}
    type: openai_embedding
    model: nomic-embed-text
    api_base: http://localhost:11434/v1/
'''
4. According to local_model.md, the configuration on the settings page has also been changed to Ollama. (A minimal connectivity check for the endpoint above is sketched below.)
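To help isolate whether the failure is in the endpoint itself or in the GraphRAG pipeline, here is a minimal sketch (not part of the original report) that talks to the same Ollama OpenAI-compatible endpoint, run from the environment that runs kotaemon. It assumes the openai Python package >= 1.0 and that qwen2 and nomic-embed-text have already been pulled; note that the /v1/embeddings route is only available in relatively recent Ollama releases.
'''
# Sketch: verify the Ollama OpenAI-compatible endpoint from the kotaemon environment.
# Assumes `openai` >= 1.0 and that the models below were pulled with `ollama pull`.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1/",  # same value as OPENAI_API_BASE above
    api_key="ollama",                       # Ollama ignores the key, but the client requires one
)

# Chat completion through the OpenAI-compatible route
chat = client.chat.completions.create(
    model="qwen2",
    messages=[{"role": "user", "content": "Hi"}],
)
print(chat.choices[0].message.content)

# Embeddings through the same endpoint (recent Ollama versions only)
emb = client.embeddings.create(model="nomic-embed-text", input="test sentence")
print(len(emb.data[0].embedding))
'''
If this script fails with the same APIConnectionError, the problem is network or endpoint configuration rather than anything GraphRAG-specific.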
Screenshots
No response
Logs
14:56:54,552 root ERROR error extracting graph
Traceback (most recent call last):
File "/home/master/.local/lib/python3.10/site-packages/httpx/_transports/default.py", line 72, in map_httpcore_exceptions
yield
File "/home/master/.local/lib/python3.10/site-packages/httpx/_transports/default.py", line 377, in handle_async_request
resp = await self._pool.handle_async_request(req)
File "/home/master/.local/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 216, in handle_async_request
raise exc from None
File "/home/master/.local/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 196, in handle_async_request
response = await connection.handle_async_request(
File "/home/master/.local/lib/python3.10/site-packages/httpcore/_async/connection.py", line 99, in handle_async_request
raise exc
File "/home/master/.local/lib/python3.10/site-packages/httpcore/_async/connection.py", line 76, in handle_async_request
stream = await self._connect(request)
File "/home/master/.local/lib/python3.10/site-packages/httpcore/_async/connection.py", line 122, in _connect
stream = await self._network_backend.connect_tcp(**kwargs)
File "/home/master/.local/lib/python3.10/site-packages/httpcore/_backends/auto.py", line 30, in connect_tcp
return await self._backend.connect_tcp(
File "/home/master/.local/lib/python3.10/site-packages/httpcore/_backends/anyio.py", line 114, in connect_tcp
with map_exceptions(exc_map):
File "/home/master/anaconda3/envs/kotae/lib/python3.10/contextlib.py", line 153, in __exit__
self.gen.throw(typ, value, traceback)
File "/home/master/.local/lib/python3.10/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
raise to_exc(exc) from exc
httpcore.ConnectError: All connection attempts failed
Browsers
Chrome
OS
Windows
Additional information
No response
"Why am I using this configuration:
makefile Copy code api_key: ollama base_url: http://localhost:11434/v1/ model: llama3.1:8b Test as follows:
Testing model: ollama Sending a message: Hi Connection failed. Got error: Error code: 400 - {'error': {'message': 'unexpected EOF', 'type': 'invalid_request_error', 'param': None, 'code': None}}. When testing with http://localhost:11434/ in the browser, it shows 'Ollama is running', but I get an error when setting it up. Why is this happening?"**
"Why am I using this configuration:
makefile Copy code api_key: ollama base_url: http://localhost:11434/v1/ model: llama3.1:8b Test as follows:
Testing model: ollama Sending a message: Hi Connection failed. Got error: Error code: 400 - {'error': {'message': 'unexpected EOF', 'type': 'invalid_request_error', 'param': None, 'code': None}}. When testing with http://localhost:11434/ in the browser, it shows 'Ollama is running', but I get an error when setting it up. Why is this happening?"**
I use the same setup, but the error is different:
- Connection failed. Got error: Error code: 404 - {'error': {'message': 'model "llama3.1:8b" not found, try pulling it first', 'type': 'api_error', 'param': None, 'code': None}}
The same error persists even after I change the parameters; it seems the settings here are simply not being read.
At the same time, the settings on the Embeddings tab (also Ollama) do work.
"Why am I using this configuration: makefile Copy code api_key: ollama base_url: http://localhost:11434/v1/ model: llama3.1:8b Test as follows: Testing model: ollama Sending a message: Hi Connection failed. Got error: Error code: 400 - {'error': {'message': 'unexpected EOF', 'type': 'invalid_request_error', 'param': None, 'code': None}}. When testing with http://localhost:11434/ in the browser, it shows 'Ollama is running', but I get an error when setting it up. Why is this happening?"**
I use the same setup, but the error is different.
- Connection failed. Got error: Error code: 404 - {'error': {'message': 'model "llama3.1:8b" not found, try pulling it first', 'type': 'api_error', 'param': None, 'code': None}}
And the same error insists even I changed the parameters. It seems it just don't read settings here.
And at the same time, the settings on Embeddings Tab (also ollama) is work
Guys, maybe you never found the real reason. First, the settings in the UI do not work at all. Second, it reported "not found" because I had pulled the model without a version, so it carried the tag latest. After I pulled it with :8b it succeeded, although there was nothing left to download.
change model: llama3.1:8b to model: llama3.1:latest
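To see which exact tags the local Ollama server knows about (so the model: field can be matched verbatim), here is a small sketch using Ollama's /api/tags endpoint, which reports the same information as `ollama list`:
'''
# Sketch: print the exact model tags the local Ollama server exposes,
# e.g. llama3.1:latest vs llama3.1:8b, using only the standard library.
import json
import urllib.request

with urllib.request.urlopen("http://localhost:11434/api/tags") as resp:
    tags = json.load(resp)

for model in tags.get("models", []):
    print(model["name"])
'''
Whatever names this prints are the strings the model: field must match exactly.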
When using the Docker image, please replace http://localhost with http://host.docker.internal to correctly communicate with the service on the host machine.
@WeipengMO nice!!!
Also on Docker:
api_key: ollama
base_url: http://host.docker.internal:11434/v1/
model: phi3.5:latest
I have Ollama running phi3:latest in the terminal (top left), Docker is running fine, and I am able to get into Kotaemon.
Not sure :crying_cat_face:
Hi @Niko-La, I am running Docker on WSL2, following the steps outlined in this link under the "Use Local Models for RAG" section.
I hope this information helps.
@WeipengMO I'm on Ubuntu, so I had to use the host's actual IP address instead of host.docker.internal,
and
modify the Ollama configuration: since Ollama runs on the host machine, ensure it is listening on all interfaces, not just localhost, by adding this to your Ollama service file:
Environment="OLLAMA_HOST=0.0.0.0:11434"
thx
nikola@nikola:~/Downloads/kotaemon$ ollama list
NAME ID SIZE MODIFIED
nomic-embed-text:latest 0a109f422b47 274 MB 2 minutes ago
gemma2:2b 8ccf136fdd52 1.6 GB 38 minutes ago
phi3.5:latest 3b387c8dd9b7 2.2 GB 2 weeks ago
llama3:instruct a6990ed6be41 4.7 GB 4 months ago
Local LLM and embeddings are now working with this setup. @WeipengMO thanks for your support.
Still cannot connect to the Ollama embedding, whether with host.docker.internal, 0.0.0.0, 127.0.0.1, or the host's actual IP (192.168....) :(
I am using kotaemon:latest. Can someone help me?
I was having the same issue. The Kotaemon app was working standalone, but when I switched to the Docker image it could not connect to my Ollama server. I changed nothing about my Ollama server; only the Kotaemon app changed.
It appears to be an issue with the Docker runtime making the host-internal call to Ollama.
Here are my troubleshooting steps:
docker run --add-host=host.docker.internal:host-gateway -e GRADIO_SERVER_NAME=0.0.0.0 -e GRADIO_SERVER_PORT=7860 -p 7860:7860 -it --rm ghcr.io/cinnamon/kotaemon:main-full
Notice I've added the host-gateway parameter per the StackExchange recommendation.
I grab the Kotaemon container ID:
docker ps
And connect directly to the Kotaemon docker runtime:
docker exec -it [RUNTIME_ID_FROM_PREVIOUS_COMMAND] /bin/bash
Now that I'm within the app, curl to the Ollama server's API endpoint fails:
root@0e64b0f6d93b:/app# curl host.docker.internal:11434
curl: (7) Failed to connect to host.docker.internal port 11434 after 0 ms: Couldn't connect to server
To avoid Docker "magic" about hostname lookup, I switched to calling my desktop's IP address directly where Ollama is running:
root@0e64b0f6d93b:/app# curl 192.168.0.221:11434
curl: (7) Failed to connect to 192.168.0.221 port 11434 after 0 ms: Couldn't connect to server
So the failure is happening within the Kotaemon app runtime to the host. This is strange because it's an outbound call from the Docker runtime. A public website responds to the Docker runtime, like below:
root@0e64b0f6d93b:/app# curl https://qrenco.de/
Hmmm ... well, then I enabled host networking for the Docker image and switched back to calling localhost within the app runtime, like this:
docker run -e GRADIO_SERVER_NAME=0.0.0.0 -e GRADIO_SERVER_PORT=7860 -p 7860:7860 --net=host -it --rm ghcr.io/cinnamon/kotaemon:main-full
WARNING: Published ports are discarded when using host network mode
...
root@justin-two-towers:/app# curl localhost:11434
Ollama is running
TL;DR add host networking to the Kotaemon app docker runtime as the parameter --net=host. Keep the app's Resources->LLM set to the default localhost:11434. Do NOT reconfigure the Kotaemon app to point to host.docker.internal. (Why does this work? My Docker deep dive classes are dusty these days but it has to do with internal communication within the virtual network created by the docker service.)
Ollama then replies to the Test Connection:
- Testing model: ollama
- Sending a message: Hi
- Connection success. Got response: It's nice to meet you. Is there something I can help you with, or would you like to chat?
FYI on net=host usage: https://stackoverflow.com/questions/36840552/docker-cannot-connect-to-the-host-machine
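For anyone still debugging, here is a small in-container check (sketch only; it assumes python and httpx are available inside the image, which the traceback in the Logs section suggests) that shows which route from the container to Ollama actually works:
'''
# Sketch: run inside the Kotaemon container (docker exec -it <id> python) to see
# which URL reaches Ollama. With --net=host the default localhost URL should work.
import httpx

for url in ("http://localhost:11434/", "http://host.docker.internal:11434/"):
    try:
        r = httpx.get(url, timeout=5)
        print(url, "->", r.status_code, r.text.strip())
    except Exception as exc:  # e.g. httpx.ConnectError when the route is blocked
        print(url, "->", type(exc).__name__, exc)
'''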
I am running Ollama on macOS and Kotaemon using Docker. I am using llama3.2, and when testing I get the error: Connection failed. Got error: Connection error. I tried the above solutions but none of them work for me.
