
Ollama is not recognised by Devika on my local machine

Open kiran-chinthala opened this issue 1 year ago • 15 comments

python3 devika.py
24.04.01 18:39:41: root: INFO : Initializing Devika...
24.04.01 18:39:41: root: INFO : Initializing Prerequisites Jobs...
24.04.01 18:39:41: root: INFO : Loading sentence-transformer BERT models...
24.04.01 18:39:41: root: INFO : BERT model loaded successfully.
24.04.01 18:39:41: root: WARNING: Ollama not available
24.04.01 18:39:41: root: WARNING: run ollama server to use ollama models otherwise use other models
huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks... To disable this warning, you can either:

  • Avoid using tokenizers before the fork if possible
  • Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks... To disable this warning, you can either:

  • Avoid using tokenizers before the fork if possible
  • Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)

24.04.01 18:39:41: root: INFO : Devika is up and running!
24.04.01 18:39:41: root: INFO : /api/data GET
24.04.01 18:39:41: root: DEBUG : /api/data GET - Response: {"models":{"CLAUDE":[["Claude 3 Opus","claude-3-opus-20240229"],["Claude 3 Sonnet","claude-3-sonnet-20240229"],["Claude 3 Haiku","claude-3-haiku-20240307"]],"GOOGLE":[["Gemini 1.0 Pro","gemini-pro"]],"GROQ":[["GROQ Mixtral","mixtral-8x7b-32768"],["GROQ LLAMA2 70B","llama2-70b-4096"],["GROQ GEMMA 7B IT","gemma-7b-it"]],"MISTRAL":[["Mistral 7b","open-mistral-7b"],["Mistral 8x7b","open-mixtral-8x7b"],["Mistral Medium","mistral-medium-latest"],["Mistral Small","mistral-small-latest"],["Mistral Large","mistral-large-latest"]],"OLLAMA":[["mistral","mistral:latest"],["llama2","llama2:latest"]],"OPENAI":[["GPT-4 Turbo","gpt-4-0125-preview"],["GPT-3.5","gpt-3.5-turbo-0125"]]},"projects":[],"search_engines":["Bing","Google","DuckDuckGo"]}

24.04.01 18:39:41: root: INFO : /api/create-project POST
24.04.01 18:39:41: root: DEBUG : /api/create-project POST - Response: {"message":"Project created"}

24.04.01 18:39:41: root: INFO : /api/get-agent-state POST
24.04.01 18:39:41: root: DEBUG : /api/get-agent-state POST - Response: {"state":null}

24.04.01 18:39:49: root: INFO : /api/calculate-tokens POST
24.04.01 18:39:49: root: DEBUG : /api/calculate-tokens POST - Response: {"token_usage":1}

24.04.01 18:39:49: root: INFO : /api/calculate-tokens POST
24.04.01 18:39:49: root: DEBUG : /api/calculate-tokens POST - Response: {"token_usage":1}

24.04.01 18:39:50: root: INFO : /api/calculate-tokens POST
24.04.01 18:39:50: root: DEBUG : /api/calculate-tokens POST - Response: {"token_usage":1}

24.04.01 18:39:50: root: INFO : /api/calculate-tokens POST
24.04.01 18:39:50: root: DEBUG : /api/calculate-tokens POST - Response: {"token_usage":2}

24.04.01 18:39:50: root: INFO : /api/calculate-tokens POST
24.04.01 18:39:50: root: DEBUG : /api/calculate-tokens POST - Response: {"token_usage":2}

24.04.01 18:39:50: root: INFO : /api/calculate-tokens POST
24.04.01 18:39:50: root: DEBUG : /api/calculate-tokens POST - Response: {"token_usage":3}

Exception in thread Thread-3 ():
Traceback (most recent call last):
  File "/Users/kiran/miniconda3/lib/python3.11/threading.py", line 1038, in _bootstrap_inner
    self.run()
  File "/Users/kiran/miniconda3/lib/python3.11/threading.py", line 975, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/kiran/Documents/GitHub/devika/devika.py", line 94, in
    thread = Thread(target=lambda: agent.execute(message, project_name, search_engine))
  File "/Users/kiran/Documents/GitHub/devika/src/agents/agent.py", line 263, in execute
    plan = self.planner.execute(prompt, project_name_from_user)
  File "/Users/kiran/Documents/GitHub/devika/src/agents/planner/planner.py", line 70, in execute
    response = self.llm.inference(prompt, project_name)
  File "/Users/kiran/Documents/GitHub/devika/src/llm/llm.py", line 98, in inference
    response = model.inference(self.model_id, prompt).strip()
  File "/Users/kiran/Documents/GitHub/devika/src/llm/ollama_client.py", line 20, in inference
    response = self.client.generate(
AttributeError: 'NoneType' object has no attribute 'generate'
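For context, the AttributeError means the Ollama client object was never created: the startup check already logged "Ollama not available", left the client as None, and the later generate() call then fails. A minimal sketch of that failure mode (illustrative only, not Devika's actual source; the class and parameter names here are made up):

import ollama

class OllamaWrapper:
    # Hypothetical wrapper mirroring the failure seen in ollama_client.py
    def __init__(self, host="http://127.0.0.1:11434"):
        try:
            self.client = ollama.Client(host=host)
            self.client.list()   # probe the server once at startup
        except Exception:
            self.client = None   # -> the "Ollama not available" warning path

    def inference(self, model_id, prompt):
        if self.client is None:
            # Without this guard, calling generate() on None raises the
            # AttributeError shown in the traceback above.
            raise RuntimeError("Ollama server not reachable; is `ollama serve` running?")
        return self.client.generate(model=model_id, prompt=prompt)["response"]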

I have updated /Users/kiran/Documents/GitHub/devika/src/llm/llm.py:

        "OLLAMA": [
            ("mistral", "mistral"),
            ("llama2", "llama2"),
        ]
    }

$ ollama serve
time=2024-04-01T18:35:10.822+02:00 level=INFO source=images.go:860 msg="total blobs: 33"
time=2024-04-01T18:35:10.852+02:00 level=INFO source=images.go:867 msg="total unused blobs removed: 0"
time=2024-04-01T18:35:10.854+02:00 level=INFO source=routes.go:995 msg="Listening on 127.0.0.1:11434 (version 0.1.23)"
time=2024-04-01T18:35:10.854+02:00 level=INFO source=payload_common.go:106 msg="Extracting dynamic libraries..."
time=2024-04-01T18:35:10.870+02:00 level=INFO source=payload_common.go:145 msg="Dynamic LLM libraries [metal]"

kiran-chinthala · Apr 01 '24 16:04

So you ran ollama serve first and then python3 devika.py, or the reverse?

obliviousz · Apr 01 '24 17:04

So you ran ollama serve first and then python3 devika.py, or the reverse?

Yes, that is what I did: first ollama serve, and then I started the Devika server.

kiran-chinthala · Apr 01 '24 19:04

Does

ollama run llama2

work on your machine?

obliviousz · Apr 02 '24 13:04

Does

ollama run llama2

work on your machine?

Yes, it does. Below is the command and its output:

$ ollama run llama2

tell me a joke

Sure, here's one:

Why don't scientists trust atoms? Because they make up everything!

I hope that brought a smile to your face!

Send a message (/? for help)

kiran-chinthala · Apr 02 '24 15:04

24.04.02 18:34:11: root: INFO : Initializing Devika...
24.04.02 18:34:11: root: INFO : Initializing Prerequisites Jobs...
24.04.02 18:34:18: root: INFO : Loading sentence-transformer BERT models...
24.04.02 18:34:23: root: INFO : BERT model loaded successfully.
24.04.02 18:34:26: root: WARNING: Ollama not available
24.04.02 18:34:26: root: WARNING: run ollama server to use ollama models otherwise use other models
24.04.02 18:34:28: root: INFO : Devika is up and running!

docker exec -it devika-devika-backend-engine-1 bash
nonroot@60c02bb85332:~$ curl -f http://ollama:11434
Ollama is running
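Worth noting: inside a container, localhost is the container itself, so which hostname actually reaches Ollama depends on the network setup. A small diagnostic sketch that probes the usual candidates (the hostnames below are only examples; adjust them to your setup):

import requests

CANDIDATES = [
    "http://ollama:11434",                # docker-compose service name
    "http://host.docker.internal:11434",  # the host machine, from inside Docker Desktop
    "http://127.0.0.1:11434",             # only works from the same machine/container
]

for url in CANDIDATES:
    try:
        # A reachable Ollama server answers the root path with "Ollama is running"
        print(url, "->", requests.get(url, timeout=2).text.strip())
    except requests.RequestException as exc:
        print(url, "-> unreachable:", exc)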


ShiFangJuMie · Apr 02 '24 18:04

There was a change to src/llm/ollama_client.py in commit 7cd567b. Could it be that change? I am unsure whether it was working previously, though.

alexdodson · Apr 03 '24 02:04

docker exec -it devika-devika-backend-engine-1 bash
nonroot@60c02bb85332:~$ curl -f http://ollama:11434
Ollama is running

I think that ollama with docker will only work if you run it using the docker-compose file

samuelbirocchi · Apr 03 '24 13:04

You have to update the Ollama URL if it's not the default one.
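Whatever address the backend is configured with has to resolve from wherever the backend actually runs. As a quick sanity check with the ollama Python library (the URL below is only a placeholder for a non-default address):

import ollama

# Point the client at the server's real address instead of the default
# http://127.0.0.1:11434 (placeholder URL, adjust to your setup).
client = ollama.Client(host="http://ollama:11434")
print(client.list()["models"])   # should include mistral:latest and llama2:latest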

As your log contains "OLLAMA":[["mistral","mistral:latest"],["llama2","llama2:latest"]], it means Devika did connect to Ollama and fetch the models. So what's the problem?

ARajgor · Apr 03 '24 19:04

https://github.com/stitionai/devika/issues/300

ARajgor · Apr 03 '24 19:04

I think that ollama with docker will only work if you run it using the docker-compose file

Ollama and Devika are on the same network, and Ollama can be reached from the Devika container using curl.

ShiFangJuMie · Apr 04 '24 03:04

You have to update the Ollama URL if it's not the default one.

As your log contains "OLLAMA":[["mistral","mistral:latest"],["llama2","llama2:latest"]], it means Devika did connect to Ollama and fetch the models. So what's the problem?

A couple of questions:

  1. I don't want to use Docker to configure the application; I am trying to start the backend server directly from the code. In that case, what is the solution for getting the local Ollama server recognised?
  2. I have manually added these entries in llm.py; they are not fetched from the Ollama server.

kiran-chinthala · Apr 04 '24 07:04

Ollama and Devika are on the same network, and Ollama can be reached from the Devika container using curl.

I agree with your point: if they are on the same network, it is recognised. But in my case I want to try running from code, not the Docker engine. How can I combine these two pieces, the Devika code and a separately installed Ollama server? Please advise on this scenario. Thanks.

kiran-chinthala · Apr 04 '24 07:04

Has anyone found a solution yet?

ChanghongYangR · Apr 06 '24 05:04

A couple of questions:

  1. I don't want to use Docker to configure the application; I am trying to start the backend server directly from the code. In that case, what is the solution for getting the local Ollama server recognised?
  2. I have manually added these entries in llm.py; they are not fetched from the Ollama server.

Share the list of models from the Ollama terminal, because if any Ollama models are present the library fetches them automatically. Also, are you using the latest version of Ollama?
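Presumably that automatic fetch boils down to something like the sketch below (illustrative, not Devika's actual code; the "name" key follows the ollama Python library of that era):

import ollama

try:
    client = ollama.Client()   # default http://127.0.0.1:11434
    # Each entry looks like {"name": "llama2:latest", ...}; the UI shows
    # (display name, model id) pairs such as ("llama2", "llama2:latest").
    models = [(m["name"].split(":")[0], m["name"]) for m in client.list()["models"]]
except Exception:
    client, models = None, []  # the "Ollama not available" fallback

print(models)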

ARajgor · Apr 06 '24 05:04

Does this issue still persist? If so, can you run this code:

import ollama

# List the models the local Ollama server (default http://127.0.0.1:11434) reports as installed
client = ollama.Client()
print(client.list()["models"])

and in a terminal:

ollama        # check if it's installed on your system
ollama list   # show the locally installed models

ARajgor · Apr 20 '24 12:04