WrenAI
Allow using locally-running OpenAI API-compatible service
Is your feature request related to a problem? Please describe. We have Ollama and Jan.ai running locally and want to use them instead of OpenAI for data privacy reasons.
Describe the solution you'd like Please allow adding a URL to the configuration. If the service conforms to the OpenAI API, the rest of the code should work (see the sketch below).
Describe alternatives you've considered IP forwarding, but it's clunky.
Additional context Other similar services (e.g. CodeGPT) allow custom URLs for the LLM services.
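For illustration, conforming to the OpenAI API means a client only needs a different base URL. A minimal sketch (the endpoint, placeholder API key, and model name are examples for Ollama's OpenAI-compatible endpoint, not WrenAI code):

```python
# Sketch: point the standard OpenAI client at a locally-running,
# OpenAI-compatible service instead of api.openai.com.
from openai import OpenAI

# Ollama serves an OpenAI-compatible endpoint on localhost:11434 by default;
# the API key is a placeholder since local services typically ignore it.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="llama3",  # any model pulled locally, e.g. `ollama pull llama3`
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
```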
Thanks for raising the issue and suggesting a great solution for supporting other LLM services! We'll definitely take a look and think about it.
@igor-elbert There is another issue concerning how we support embedding models other than OpenAI's. As of now, I suppose we can't directly use Ollama's embedding models, since they don't conform to OpenAI's API. Am I correct?
Reference: https://ollama.com/blog/embedding-models. In the "Coming soon" section, OpenAI API Compatibility is one of the items.
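To make the mismatch concrete, here is a sketch of the two request shapes at the time of this discussion (endpoint path and model name follow the linked blog post; this is an illustration, not WrenAI code):

```python
# Ollama's native embeddings endpoint takes "prompt" and returns a bare
# {"embedding": [...]}, which differs from OpenAI's /v1/embeddings shape.
import requests

resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "nomic-embed-text", "prompt": "hello world"},
)
print(resp.json()["embedding"][:5])

# OpenAI's shape, for comparison: the request uses "input" and the
# response nests vectors under {"data": [{"embedding": [...]}, ...]}.
```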
I have created a branch for this issue: https://github.com/Canner/WrenAI/tree/feature/ai-service/changing-providers
However, I think one issue we need to tackle first is that we should let community members more easily use their preferred embedding models. As of now, we only use OpenAI's embedding models. There are three things community members would likely want to change on their own: generators, vector databases, and embedding models. One caveat in our current design is that the generator and the embedding model must come from the same LLM provider, such as OpenAI.
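For concreteness, a minimal sketch (illustrative interfaces, not WrenAI's actual ones) of how the three pieces could be decoupled so the generator and the embedder can come from different providers:

```python
# Illustrative protocols for the three pluggable pieces named above.
from typing import Protocol


class Generator(Protocol):
    def generate(self, prompt: str) -> str: ...


class Embedder(Protocol):
    def embed(self, text: str) -> list[float]: ...


class VectorDatabase(Protocol):
    def upsert(self, doc_id: str, vector: list[float]) -> None: ...


# With independent interfaces, a pipeline can mix providers, e.g. an
# Ollama-backed Generator with an OpenAI-backed Embedder.
```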
What are your thoughts about it?
Apropos of this, support for defog's SqlCoder would be nice.
I think Ollama does support it: "OpenAI compatibility" · Ollama Blog (ollama.com)
@igor-elbert @ccollie I've tested Ollama's text generation and embedding models for OpenAI API compatibility. The result is that the embedding model can't be used through the OpenAI API. Please check out the gist URL below for reproduction, and correct me if I am wrong, thanks.
https://gist.github.com/cyyeh/b1042006b4ca067f2a75abd97e3749fb
Unfortunately I don't (yet) have an Ollama setup. However, I did find this, which led me to believe this is possible.
Sorry, I misread the gist (re embeddings), but as far as the Ollama docs go, they currently only support the chat completions API.
@igor-elbert @ccollie
Hi, we just refined how you can add your preferred LLM and Document Store. You only need to define the LLM and Document Store and their environment variables! For details, please check out the guide here: https://docs.getwren.ai/installation/custom_llm
For adding Ollama, I've created a branch with a minimal implementation; feel free to check it out: https://github.com/Canner/WrenAI/tree/feature/ai-service/add-ollama
ONE CAVEAT: after you define your own LLM, you may find the AI pipelines break. That's because your LLM may not be suitable for the prompts, so you need to do prompt engineering for now. In the future, we'll come up with ways for you to easily extend and customize your prompts, and you're welcome to share your thoughts here. As of now, I suppose prompts and the respective LLM should match to get the best performance.
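As an illustration of the kind of per-model prompt customization this caveat points at (the names and templates here are hypothetical, not WrenAI's actual pipeline code):

```python
# Hypothetical example: smaller local models often need more explicit,
# tightly-scoped instructions than the default prompt assumes.
DEFAULT_TEMPLATE = (
    "### Task\nGenerate SQL for: {question}\n### Schema\n{schema}"
)

LOCAL_MODEL_TEMPLATE = (
    "You are a SQL generator. Output exactly one SQL statement and nothing else.\n"
    "Question: {question}\nSchema: {schema}\nSQL:"
)


def build_prompt(question: str, schema: str,
                 template: str = DEFAULT_TEMPLATE) -> str:
    # Swap the template per model family when the default prompt breaks.
    return template.format(question=question, schema=schema)
```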
If there are no more issues, I'll close this issue then. Thank you :)
You might also be interested in these models, which you can run locally and which can generate SQL:
https://ollama.com/library/sqlcoder
https://ollama.com/library/codeqwen
https://ollama.com/library/starcoder2
We'll merge the add-ollama branch into the main branch after we make sure it won't break our current AI pipelines. We will investigate some ways to solve the issue, for example: https://github.com/noamgat/lm-format-enforcer
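A sketch of what lm-format-enforcer does, based on its documented HuggingFace transformers integration (the exact API may vary by version; the model and schema are placeholders):

```python
# Constrain generation so the output must parse as JSON matching a schema.
from transformers import AutoModelForCausalLM, AutoTokenizer
from lmformatenforcer import JsonSchemaParser
from lmformatenforcer.integrations.transformers import (
    build_transformers_prefix_allowed_tokens_fn,
)

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder model
model = AutoModelForCausalLM.from_pretrained("gpt2")

schema = {"type": "object", "properties": {"sql": {"type": "string"}},
          "required": ["sql"]}
prefix_fn = build_transformers_prefix_allowed_tokens_fn(
    tokenizer, JsonSchemaParser(schema)
)

inputs = tokenizer("Answer as JSON with a 'sql' field: ", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=60,
                        prefix_allowed_tokens_fn=prefix_fn)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```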
The duckdb-nsql model might also be useful for this project, since WrenAI already has DuckDB.
To enforce some format you might need a model that supports function calling, such as Mistral 7B v0.3. Please note this model might not be particularly powerful at SQL generation. @cyyeh
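For reference, OpenAI-style function calling looks like this (a sketch; the local endpoint and model name are assumptions, and the served model must actually support tool calls):

```python
from openai import OpenAI

# Assumes a local OpenAI-compatible endpoint serving a model with
# function-calling support (e.g. Mistral 7B v0.3 behind a local server).
client = OpenAI(base_url="http://localhost:11434/v1", api_key="local")

tools = [{
    "type": "function",
    "function": {
        "name": "run_sql",  # hypothetical tool for illustration
        "description": "Execute a SQL query against the warehouse",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

resp = client.chat.completions.create(
    model="mistral",
    messages=[{"role": "user", "content": "How many users signed up today?"}],
    tools=tools,
)
print(resp.choices[0].message.tool_calls)
```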
FYI, the two most popular inference engines are Ollama (partly compatible with the OpenAI APIs; it mostly uses its own APIs) and LocalAI (which tends to be almost fully compatible with the OpenAI APIs).
I would suggest using the LiteLLM framework, which can abstract over different LLM providers and make it easier to maintain and add new ones.
All, Ollama has been integrated in this branch, and you can also use any OpenAI API-compatible LLM: chore/ai-service/update-env
We'll merge this branch into the main branch in the near future and update the documentation.
For now, I'll delete the original Ollama branch.
Thank you all for your patience.
Related PR: https://github.com/Canner/WrenAI/pull/376
All, we now support using Ollama and other OpenAI API-compatible LLMs with the latest release: https://github.com/Canner/WrenAI/releases/tag/0.6.0
Setup instructions for running Wren AI with your custom LLM: https://docs.getwren.ai/installation/custom_llm#running-wren-ai-with-your-custom-llm-or-document-store
Currently, there is one obvious limitation for custom LLMs: you need to use the same provider (such as OpenAI or Ollama) for both the LLM and the embedding model. We'll fix that and release a new version soon. Stay tuned 🙂
I'll close this issue as completed now.