
Ollama integration not working

[Open] rapidarchitect opened this issue on Jan 26, 2024 · 6 comments

🐛 Describe the bug

I tried using the Ollama example and I get openai.AuthenticationError: Error code: 401.

Here is the code:

import os

from embedchain import App

# load llm configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
app.query("What is the capital of France?")

rapidarchitect · Jan 26, 2024

The config.yaml is below, and yes, Ollama is running and I have pulled llama2:

llm:
  provider: ollama
  config:
    model: 'llama2'
    temperature: 0.5
    top_p: 1
    stream: true

rapidarchitect · Jan 26, 2024
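Worth noting for anyone who lands here: the YAML above configures only the LLM, so embedchain still falls back to its default OpenAI embedding function when it builds or queries the vector store, which appears to be where the 401 comes from. A minimal sketch of pinning a local embedder explicitly, using the huggingface provider that a later comment in this thread confirms works (assumes sentence-transformers is installed):

from embedchain import App

# Same Ollama LLM as above, plus an explicit local embedder so nothing
# falls back to OpenAI. Provider and model are taken from a working
# config later in this thread; requires `pip install sentence-transformers`.
app = App.from_config(config={
    "llm": {
        "provider": "ollama",
        "config": {"model": "llama2", "temperature": 0.5, "top_p": 1, "stream": True},
    },
    "embedder": {
        "provider": "huggingface",
        "config": {"model": "BAAI/bge-small-en-v1.5"},
    },
})
app.query("What is the capital of France?")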

Hm. +1 for me as well. I'm hoping to build a completely local embed/query agent, so I'd like to see this integrated, or have someone point me at how to build the embedding integration via Ollama.

Currently, using the default ollama config, the example still hits the OpenAI API for embedding by default (I think? see below) and thus throws this misleading stack trace.

test.py

from embedchain import App
app = App.from_config(config_path="ollama_embedchain.yaml")
print(app)

app.add('./docs/global_politics.pdf', data_type='pdf_file')

print(app.query("What century did we begin to see an extended definition of the political community?"))

ollama_embedchain.yaml

llm:
  provider: ollama
  config:
    model: 'llama2'
    temperature: 0.5
    top_p: 1
    stream: true

Stack Trace

Traceback (most recent call last):
 File "/app/doc_search_test.py", line 7, in <module>
   print(app.query("What century did we begin to see an extended definition of the political community?"))
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/usr/local/lib/python3.12/site-packages/embedchain/embedchain.py", line 479, in query
   contexts = self._retrieve_from_database(
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/usr/local/lib/python3.12/site-packages/embedchain/embedchain.py", line 438, in _retrieve_from_database
   contexts = self.db.query(
              ^^^^^^^^^^^^^^
 File "/usr/local/lib/python3.12/site-packages/embedchain/vectordb/chroma.py", line 217, in query
   result = self.collection.query(
            ^^^^^^^^^^^^^^^^^^^^^^
 File "/usr/local/lib/python3.12/site-packages/chromadb/api/models/Collection.py", line 327, in query
   valid_query_embeddings = self._embed(input=valid_query_texts)
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/usr/local/lib/python3.12/site-packages/chromadb/api/models/Collection.py", line 633, in _embed
   return self._embedding_function(input=input)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/usr/local/lib/python3.12/site-packages/chromadb/api/types.py", line 193, in __call__
   result = call(self, input)
            ^^^^^^^^^^^^^^^^^
 File "/usr/local/lib/python3.12/site-packages/chromadb/utils/embedding_functions.py", line 188, in __call__
   embeddings = self._client.create(
                ^^^^^^^^^^^^^^^^^^^^
 File "/usr/local/lib/python3.12/site-packages/openai/resources/embeddings.py", line 113, in create
   return self._post(
          ^^^^^^^^^^^
 File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 1200, in post
   return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 889, in request
   return self._request(
          ^^^^^^^^^^^^^^
 File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 965, in _request
   return self._retry_request(
          ^^^^^^^^^^^^^^^^^^^^
 File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 1013, in _retry_request
   return self._request(
          ^^^^^^^^^^^^^^
 File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 965, in _request
   return self._retry_request(
          ^^^^^^^^^^^^^^^^^^^^
 File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 1013, in _retry_request
   return self._request(
          ^^^^^^^^^^^^^^
 File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 980, in _request
   raise self._make_status_error_from_response(err.response) from None
openai.RateLimitError: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}

I noticed, though, that the embedder options for embedchain do not include ollama. Maybe this error is happening because it defaults to openai?

KeithHanson · Feb 24, 2024
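For anyone who wants to roll their own in the meantime: Ollama exposes an embeddings endpoint, so a standalone embedding function is straightforward. A rough sketch (the model name here is illustrative; assumes a local Ollama instance with that model already pulled):

import requests

def ollama_embed(texts, model="nomic-embed-text", base_url="http://127.0.0.1:11434"):
    """Return one embedding vector per input text via Ollama's /api/embeddings."""
    embeddings = []
    for text in texts:
        resp = requests.post(
            f"{base_url}/api/embeddings",
            json={"model": model, "prompt": text},
            timeout=60,
        )
        resp.raise_for_status()
        embeddings.append(resp.json()["embedding"])
    return embeddings

print(len(ollama_embed(["hello world"])[0]))  # embedding dimensionality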

I've taken my first stab here - will submit a proper pull request soon:

https://github.com/KeithHanson/embedchain/tree/ollama-embedding-provider

ollama_embedchain.yaml

llm:
  provider: ollama
  config:
    model: 'eas/nous-hermes-2-solar-10.7b'
    temperature: 0.5
    top_p: 1
    stream: true

embedder:
  provider: ollama
  config:
    model: 'eas/nous-hermes-2-solar-10.7b'
    base_url: 'http://host.docker.internal:11434'

doc_search_testing.py

from embedchain import App

app = App.from_config(config_path="ollama_embedchain.yaml")
print(app)

app.add('./docs/global_politics.pdf', data_type='pdf_file')

app.query("What century did we begin to see an extended definition of the political community?")

KeithHanson · Feb 25, 2024
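Since the config above points at host.docker.internal, it's worth confirming that the Ollama endpoint is reachable from inside the container before blaming the integration. A quick sketch of a connectivity check:

import requests

# Ollama's root endpoint answers with "Ollama is running" when reachable.
# Swap in http://127.0.0.1:11434 if you are not running inside Docker.
resp = requests.get("http://host.docker.internal:11434", timeout=5)
print(resp.status_code, resp.text)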

I got it working with Streamlit in the code below. Make sure you install sentence-transformers and delete your original .embedchain folder.

import os
import time

import databutton as db  # only used on the Databutton platform; unused in this snippet
import openai  # unused here; kept from the original
import streamlit as st
from embedchain import App

st.subheader(" Create Instant ChatBot 🤖 using embedchain")
st.markdown(" Repo : Embedchain")

@st.cache_resource
def botadd(URL):
    databutton_bot = App.from_config(config={
        "llm": {
            "provider": "ollama",
            "config": {
                "model": "llama2",
                "temperature": 0.5,
                "top_p": 1,
                "stream": True,
            },
        },
        "embedder": {
            "provider": "huggingface",
            "config": {"model": "BAAI/bge-small-en-v1.5"},
        },
    })
    # Embed Online Resources
    databutton_bot.add("web_page", URL)
    return databutton_bot

if "btn_state" not in st.session_state:
    st.session_state.btn_state = False

prompt = st.text_input(
    "Enter a URL: ",
    placeholder="https://docs.databutton.com/howto/store-and-load-faiss-vectordb",
)
btn = st.button("Initialize Bot")

if btn or st.session_state.btn_state:
    st.session_state.btn_state = True
    databutton_bot = botadd(prompt)
    st.success("Bot Ready ☑️! ")

    # Initialize chat history
    if "messages" not in st.session_state:
        st.session_state.messages = []

    # Display chat messages from history on app rerun
    for message in st.session_state.messages:
        with st.chat_message(message["role"]):
            st.markdown(message["content"])

    # Accept user input
    if prompt := st.chat_input("What is up?"):
        # Add user message to chat history
        st.session_state.messages.append({"role": "user", "content": prompt})
        # Display user message in chat message container
        with st.chat_message("user"):
            st.markdown(prompt)
        # Display assistant response in chat message container
        with st.chat_message("assistant"):
            message_placeholder = st.empty()
            full_response = ""
            assistant_response = databutton_bot.query(prompt)

        # Simulate stream of response with milliseconds delay
        for chunk in assistant_response.split():
            full_response += chunk + " "
            time.sleep(0.05)
            # Add a blinking cursor to simulate typing
            message_placeholder.markdown(full_response + "▌")
        message_placeholder.markdown(full_response)
        # Add assistant response to chat history
        st.session_state.messages.append(
            {"role": "assistant", "content": full_response}
        )
else:
    st.info("Initiate a bot first!")

rapidarchitect · Feb 25, 2024

"embedder": { "provider": "huggingface", "config": { "model": "BAAI/bge-small-en-v1.5" } }

That is the huggingface provider, not embedding via Ollama, just FYI. It will indeed get things working, but it skips Ollama for embedding and uses HF, even though Ollama DOES have embedding support.

KeithHanson · Feb 26, 2024

Good point. I figured I'd try just the LLM first, then embedding second :)

rapidarchitect · Feb 26, 2024

I lost about two hours to this.

Please add this to the Ollama docs:

os.environ["OLLAMA_HOST"] = "http://127.0.0.1:11434"

and in config.yaml, something like:

embedder:
  provider: ollama
  config:
    model: znbang/bge:small-en-v1.5-q8_0
    base_url: http://127.0.0.1:11434

Otherwise it will keep looking for the embedder on OpenAI.

golemus · Jun 8, 2024
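Putting the pieces from this thread together, a fully local end-to-end setup might look like the following sketch (assumes an embedchain version that ships the ollama embedder provider, and that both models have already been pulled):

import os

# Point the Ollama client at the local server before constructing the app.
os.environ["OLLAMA_HOST"] = "http://127.0.0.1:11434"

from embedchain import App

app = App.from_config(config={
    "llm": {
        "provider": "ollama",
        "config": {"model": "llama2", "temperature": 0.5, "top_p": 1, "stream": True},
    },
    "embedder": {
        "provider": "ollama",
        "config": {
            "model": "znbang/bge:small-en-v1.5-q8_0",
            "base_url": "http://127.0.0.1:11434",
        },
    },
})
print(app.query("What is the capital of France?"))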