Ollama integration not working
🐛 Describe the bug
I tried using the Ollama example and I get `openai.AuthenticationError: Error code: 401`.

Here is the code:

```python
import os

from embedchain import App

# load LLM configuration from config.yaml file
app = App.from_config(config_path="config.yaml")
app.query("What is the capital of France?")
```

The config.yaml is below, and yes, Ollama is running and I have pulled llama2:

```yaml
llm:
  provider: ollama
  config:
    model: 'llama2'
    temperature: 0.5
    top_p: 1
    stream: true
```
Hm. +1 for me as well. I'm hoping to implement a completely local embed/query agent, so I'd like to see this integrated, or have someone point me at how to build the integration for embedding via Ollama.

Currently, with the default Ollama config, the example still hits the OpenAI API for embedding (I think? see below) and thus throws this misleading stack trace.
test.py

```python
from embedchain import App

app = App.from_config(config_path="ollama_embedchain.yaml")
print(app)
app.add('./docs/global_politics.pdf', data_type='pdf_file')
print(app.query("What century did we begin to see an extended definition of the political community?"))
```
ollama_embedchain.yaml

```yaml
llm:
  provider: ollama
  config:
    model: 'llama2'
    temperature: 0.5
    top_p: 1
    stream: true
```
Stack Trace

```
Traceback (most recent call last):
  File "/app/doc_search_test.py", line 7, in <module>
    print(app.query("What century did we begin to see an extended definition of the political community?"))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/embedchain/embedchain.py", line 479, in query
    contexts = self._retrieve_from_database(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/embedchain/embedchain.py", line 438, in _retrieve_from_database
    contexts = self.db.query(
               ^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/embedchain/vectordb/chroma.py", line 217, in query
    result = self.collection.query(
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/chromadb/api/models/Collection.py", line 327, in query
    valid_query_embeddings = self._embed(input=valid_query_texts)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/chromadb/api/models/Collection.py", line 633, in _embed
    return self._embedding_function(input=input)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/chromadb/api/types.py", line 193, in __call__
    result = call(self, input)
             ^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/chromadb/utils/embedding_functions.py", line 188, in __call__
    embeddings = self._client.create(
                 ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/openai/resources/embeddings.py", line 113, in create
    return self._post(
           ^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 1200, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 889, in request
    return self._request(
           ^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 965, in _request
    return self._retry_request(
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 1013, in _retry_request
    return self._request(
           ^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 965, in _request
    return self._retry_request(
           ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 1013, in _retry_request
    return self._request(
           ^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/openai/_base_client.py", line 980, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.RateLimitError: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}
```
I noticed, though, that the embedder options for embedchain do not include Ollama. Maybe this error is happening because it defaults to OpenAI?
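That reading matches the trace: the frame in `chromadb/utils/embedding_functions.py` is Chroma's OpenAI-backed embedding function, which is what ends up running when no other embedder is wired in. Roughly, it behaves like this (a sketch using chromadb's public utils; the model name is the usual default, not something confirmed by the trace):

```python
from chromadb.utils import embedding_functions

# Approximately what runs under the hood when embedchain has no
# embedder configured: Chroma's OpenAI embedding function.
openai_ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key="sk-...",                     # normally read from OPENAI_API_KEY
    model_name="text-embedding-ada-002",  # assumed default model
)

# Every add/query triggers a call like this, which hits api.openai.com --
# hence the 401/429 errors even though the LLM provider is ollama.
embeddings = openai_ef(["What century did we begin to see ..."])
```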
I've taken my first stab here - will submit a proper pull request soon:
https://github.com/KeithHanson/embedchain/tree/ollama-embedding-provider
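For anyone poking at this before the PR lands, the embedder can mirror the pattern of the existing embedchain embedders (e.g. the huggingface one), wrapping LangChain's `OllamaEmbeddings`. A rough sketch (the helper names and config fields here are my guesses from reading the existing embedders, not necessarily what the PR ends up with):

```python
from typing import Optional

from langchain_community.embeddings import OllamaEmbeddings

from embedchain.config import BaseEmbedderConfig
from embedchain.embedder.base import BaseEmbedder


class OllamaEmbedder(BaseEmbedder):
    """Sketch of an Ollama-backed embedder, mirroring the existing ones."""

    def __init__(self, config: Optional[BaseEmbedderConfig] = None):
        super().__init__(config=config)

        # Delegate embedding calls to LangChain's Ollama wrapper, which
        # talks to the local Ollama server's embeddings endpoint.
        embeddings = OllamaEmbeddings(
            model=self.config.model,
            # base_url may need to be added to BaseEmbedderConfig.
            base_url=getattr(self.config, "base_url", None)
            or "http://localhost:11434",
        )
        embedding_fn = BaseEmbedder._langchain_default_concept(embeddings)
        self.set_embedding_fn(embedding_fn=embedding_fn)

        # The vector dimension depends on the model being served
        # (e.g. 4096 for llama2), so it should really come from config.
        self.set_vector_dimension(vector_dimension=4096)
```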
ollama_embedchain.yaml

```yaml
llm:
  provider: ollama
  config:
    model: 'eas/nous-hermes-2-solar-10.7b'
    temperature: 0.5
    top_p: 1
    stream: true
embedder:
  provider: ollama
  config:
    model: 'eas/nous-hermes-2-solar-10.7b'
    base_url: 'http://host.docker.internal:11434'
```
doc_search_testing.py

```python
from embedchain import App

app = App.from_config(config_path="ollama_embedchain.yaml")
print(app)
app.add('./docs/global_politics.pdf', data_type='pdf_file')
app.query("What century did we begin to see an extended definition of the political community?")
```
I got it working with Streamlit using the code below. Make sure you install sentence-transformers and delete your original .embedchain folder.

```python
import time

import streamlit as st
from embedchain import App

st.subheader("Create Instant ChatBot 🤖 using embedchain")
st.markdown("Repo: Embedchain")


@st.cache_resource
def botadd(url):
    databutton_bot = App.from_config(config={
        "llm": {
            "provider": "ollama",
            "config": {
                "model": "llama2",
                "temperature": 0.5,
                "top_p": 1,
                "stream": True,
            },
        },
        "embedder": {
            "provider": "huggingface",
            "config": {"model": "BAAI/bge-small-en-v1.5"},
        },
    })
    # Embed the online resource
    databutton_bot.add("web_page", url)
    return databutton_bot


if "btn_state" not in st.session_state:
    st.session_state.btn_state = False

url = st.text_input(
    "Enter a URL: ",
    placeholder="https://docs.databutton.com/howto/store-and-load-faiss-vectordb",
)
btn = st.button("Initialize Bot")

if btn or st.session_state.btn_state:
    st.session_state.btn_state = True
    databutton_bot = botadd(url)
    st.success("Bot Ready ☑️!")

    # Initialize chat history
    if "messages" not in st.session_state:
        st.session_state.messages = []

    # Display chat messages from history on app rerun
    for message in st.session_state.messages:
        with st.chat_message(message["role"]):
            st.markdown(message["content"])

    # Accept user input
    if prompt := st.chat_input("What is up?"):
        # Add user message to chat history
        st.session_state.messages.append({"role": "user", "content": prompt})
        # Display user message in chat message container
        with st.chat_message("user"):
            st.markdown(prompt)
        # Display assistant response in chat message container
        with st.chat_message("assistant"):
            message_placeholder = st.empty()
            full_response = ""
            assistant_response = databutton_bot.query(prompt)
            # Simulate a stream of responses with a small delay
            for chunk in assistant_response.split():
                full_response += chunk + " "
                time.sleep(0.05)
                # Add a blinking cursor to simulate typing
                message_placeholder.markdown(full_response + "▌")
            message_placeholder.markdown(full_response)
        # Add assistant response to chat history
        st.session_state.messages.append(
            {"role": "assistant", "content": full_response}
        )
else:
    st.info("Initiate a bot first!")
```
"embedder": { "provider": "huggingface", "config": { "model": "BAAI/bge-small-en-v1.5" } }
That is the huggingface provider, not embedding via ollama, just fyi. This will indeed get it working, though it skips ollama for embedding and uses HF, even though Ollama DOES have embedding support.
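For reference, Ollama exposes embeddings directly over its local API, so something like this should work against a running server (the model name is just an example; use whatever you've pulled):

```python
import requests

# Ollama's embeddings endpoint: POST /api/embeddings with a model and prompt.
response = requests.post(
    "http://127.0.0.1:11434/api/embeddings",
    json={
        "model": "llama2",  # any pulled model that supports embeddings
        "prompt": "What century did we begin to see an extended definition?",
    },
)
print(response.json()["embedding"][:5])  # a list of floats
```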
Good point, I figured I'd try just the LLM first, then embedding second :)
I lost about 2 hours on this. Please add to the Ollama docs:

```python
os.environ["OLLAMA_HOST"] = "http://127.0.0.1:11434"
```

and in config.yaml something like:

```yaml
embedder:
  provider: ollama
  config:
    model: znbang/bge:small-en-v1.5-q8_0
    base_url: http://127.0.0.1:11434
```

Otherwise it will keep looking for the embedder on OpenAI.
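Putting the whole thread together, a fully local setup would look something like this (model names are the ones mentioned above; treat the rest as a sketch rather than the official docs):

```python
import os

# Point Ollama clients at the local server before creating the app.
os.environ["OLLAMA_HOST"] = "http://127.0.0.1:11434"

from embedchain import App

# Both the LLM and the embedder must name ollama as provider --
# leaving out the embedder section is what falls back to OpenAI.
app = App.from_config(config={
    "llm": {
        "provider": "ollama",
        "config": {"model": "llama2", "temperature": 0.5, "stream": True},
    },
    "embedder": {
        "provider": "ollama",
        "config": {
            "model": "znbang/bge:small-en-v1.5-q8_0",
            "base_url": "http://127.0.0.1:11434",
        },
    },
})

app.add("./docs/global_politics.pdf", data_type="pdf_file")
print(app.query("What is the capital of France?"))
```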