
Allow using locally-running OpenAI API-compatible service

Open igor-elbert opened this issue 1 year ago • 13 comments

Is your feature request related to a problem? Please describe. We have Ollama and Jan.ai running locally and want to use them instead of OpenAI for data privacy reasons.

Describe the solution you'd like Please allow adding a URL to the configuration. If the service conforms to the OpenAI API, the rest of the code should work (see the sketch below).

Describe alternatives you've considered IP forwarding, but it's clunky.

Additional context Other similar services (e.g. CodeGPT) allow custom URLs for LLM services.
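
For illustration, a minimal sketch of what this would look like with the official openai Python client pointed at a local Ollama server (assuming Ollama's OpenAI-compatible endpoint on its default port, 11434):

```python
# Minimal sketch, assuming Ollama serves its OpenAI-compatible
# endpoint at http://localhost:11434/v1 (the default port).
from openai import OpenAI

# Point the standard OpenAI client at the local server; api_key is
# required by the client but ignored by Ollama.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

response = client.chat.completions.create(
    model="llama3",  # any model already pulled into Ollama
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(response.choices[0].message.content)
```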

igor-elbert avatar May 20 '24 14:05 igor-elbert

Thanks for raising the issue and suggesting a great solution for supporting other LLM services! We'll definitely take a look and think about it.

cyyeh avatar May 20 '24 14:05 cyyeh

@igor-elbert There is another issue concerning how we support embedding models other than OpenAI's. As of now, I suppose we can't directly use Ollama's embedding models since they don't conform to OpenAI's API. Am I correct?

Reference: https://ollama.com/blog/embedding-models. In the "Coming soon" section, OpenAI API Compatibility is one of the items.
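
For reference, Ollama currently exposes embeddings only through its own native endpoint rather than an OpenAI-compatible one; a minimal sketch, assuming a local server with the nomic-embed-text model pulled:

```python
# Sketch of Ollama's native (non-OpenAI) embeddings endpoint, assuming
# a local server and the nomic-embed-text model already pulled.
import requests

resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "nomic-embed-text", "prompt": "Hello, world"},
)
resp.raise_for_status()
embedding = resp.json()["embedding"]  # a list of floats
print(len(embedding))
```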

cyyeh avatar May 20 '24 14:05 cyyeh

I have created a branch for this issue: https://github.com/Canner/WrenAI/tree/feature/ai-service/changing-providers

However, I think one issue we need to tackle first is letting community members use their preferred embedding models more easily. As of now, we only use OpenAI's embedding models. There are three things community members would likely want to change on their own, namely generators, vector databases, and embedding models. And one caveat in our current design is that the generator and the embedding model must come from the same LLM provider, such as OpenAI.

What are your thoughts about it?

cyyeh avatar May 20 '24 18:05 cyyeh

Apropos of this, support for Defog's SQLCoder would be nice.

ccollie avatar May 20 '24 20:05 ccollie

I think Ollama does support it: OpenAI compatibility · Ollama Blog (ollama.com)

igor-elbert avatar May 20 '24 21:05 igor-elbert

I think Ollama does support it: OpenAI compatibility

Yup. Found it here

ccollie avatar May 20 '24 22:05 ccollie

@igor-elbert @ccollie I've tested Ollama's text generation model and embedding model support for OpenAI API compatibility. The result is that the embedding model can't be used through the OpenAI API. Please check out the attached gist URL for reproduction. Please correct me if I am wrong, thanks.

https://gist.github.com/cyyeh/b1042006b4ca067f2a75abd97e3749fb
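
For anyone who can't open the gist, the check is roughly of this shape (a sketch assuming a local Ollama server; per the result above, the chat call succeeds while the embeddings call errors out):

```python
# Sketch of the compatibility check against a local Ollama server.
# At the time of this thread, chat completions worked through the
# OpenAI-compatible endpoint while embeddings did not.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# Chat completions: supported.
chat = client.chat.completions.create(
    model="llama3",
    messages=[{"role": "user", "content": "Reply with 'ok'."}],
)
print(chat.choices[0].message.content)

# Embeddings: not available via /v1 at the time, so this raises.
try:
    emb = client.embeddings.create(model="nomic-embed-text", input="hello")
    print(len(emb.data[0].embedding))
except Exception as exc:
    print("embeddings via the OpenAI API failed:", exc)
```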

cyyeh avatar May 21 '24 00:05 cyyeh

Unfortunately I don't (yet) have an Ollama setup. However, I did find this, which led me to believe this is possible

ccollie avatar May 21 '24 00:05 ccollie

Sorry, I misread the gist (re embeddings), but as far as the Ollama docs go, they currently only support the chat completions API

ccollie avatar May 21 '24 00:05 ccollie

@igor-elbert @ccollie

Hi, we just refined how you can add your preferred LLM and Document Store. You only need to define the LLM and Document Store and their environment variables! For details, please check out the guide here: https://docs.getwren.ai/installation/custom_llm
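
To give a flavor of the pattern (every name below is hypothetical, not WrenAI's actual interface; the linked guide has the real one), a custom provider boils down to a class wrapping your service plus the environment variables it reads:

```python
# Purely illustrative sketch of the provider pattern; the class name,
# method names, and environment variable names are all hypothetical.
# See the linked guide for WrenAI's actual interface.
import os

from openai import OpenAI


class CustomLLMProvider:
    """Wraps any OpenAI API-compatible service behind one interface."""

    def __init__(self) -> None:
        # Configured entirely through environment variables.
        self.client = OpenAI(
            base_url=os.environ["CUSTOM_LLM_URL"],
            api_key=os.environ.get("CUSTOM_LLM_API_KEY", "none"),
        )
        self.model = os.environ.get("CUSTOM_LLM_MODEL", "llama3")

    def generate(self, prompt: str) -> str:
        response = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
        )
        return response.choices[0].message.content
```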

For adding Ollama, I've created a branch for this and made a minimal implementation, feel free to check it out: https://github.com/Canner/WrenAI/tree/feature/ai-service/add-ollama

ONE CAVEAT: after you define your own LLM, you may find the AI pipelines break. That's because your LLM may not be suitable for the prompts, so for now you need to do some prompt engineering yourself. In the future, we'll come up with ways for you to easily extend and customize your prompts; you're also welcome to share your thoughts here. As of now, I suppose the prompts and the respective LLM should match to get the best performance.

cyyeh avatar May 21 '24 08:05 cyyeh

If there are no more issues, I'll close this issue then. Thank you :)

cyyeh avatar May 21 '24 08:05 cyyeh

You might also be interested in these models, which you can run locally and which can generate SQL:

https://ollama.com/library/sqlcoder
https://ollama.com/library/codeqwen
https://ollama.com/library/starcoder2

qdrddr avatar May 22 '24 01:05 qdrddr

We'll merge the add-ollama branch into the main branch after we make sure it won't break our current AI pipelines. We'll investigate some ways to solve the issue, for example https://github.com/noamgat/lm-format-enforcer
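
For the curious, a rough sketch of how lm-format-enforcer constrains a local model's output to a JSON schema through its transformers integration (the model name and output schema here are just placeholders):

```python
# Rough sketch of lm-format-enforcer's transformers integration; the
# model name and output schema are placeholders.
from pydantic import BaseModel
from transformers import pipeline

from lmformatenforcer import JsonSchemaParser
from lmformatenforcer.integrations.transformers import (
    build_transformers_prefix_allowed_tokens_fn,
)


class SqlAnswer(BaseModel):
    sql: str


generator = pipeline("text-generation", model="defog/sqlcoder-7b-2")
parser = JsonSchemaParser(SqlAnswer.model_json_schema())
prefix_fn = build_transformers_prefix_allowed_tokens_fn(
    generator.tokenizer, parser
)

# The prefix function masks any token that would violate the schema,
# so the generation is forced to parse as {"sql": "..."}.
output = generator(
    'Answer in JSON like {"sql": "..."}: total users per city?',
    prefix_allowed_tokens_fn=prefix_fn,
    max_new_tokens=128,
)
print(output[0]["generated_text"])
```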

cyyeh avatar May 22 '24 11:05 cyyeh

Might also be useful for this project, since WrenAI already has DuckDB: the duckdb-nsql model

qdrddr avatar May 31 '24 17:05 qdrddr

We'll merge the add-ollama branch into the main branch after we make sure it won't break our current AI pipelines. We'll investigate some ways to solve the issue, for example https://github.com/noamgat/lm-format-enforcer

To enforce some format you might need a model that supports function calling, such as Mistral 7B v0.3. Please note this model might not be particularly strong at SQL generation. @cyyeh

qdrddr avatar May 31 '24 18:05 qdrddr

FYI, the two most popular inference engines are Ollama (partly compatible with the OpenAI API, mostly using its own APIs) and LocalAI (which tends to be almost fully compatible with the OpenAI API).

I would suggest using the LiteLLM framework, which can abstract over different LLM providers and make it easier to maintain existing ones and add new ones.
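
A quick sketch of that idea, assuming LiteLLM's completion API and a local Ollama server (model names are placeholders):

```python
# Sketch of provider abstraction with LiteLLM; model names are
# placeholders, and the Ollama call assumes a local server.
from litellm import completion

messages = [{"role": "user", "content": "Write SQL counting rows in t."}]

# Same call shape for a hosted provider
# (requires OPENAI_API_KEY in the environment)...
openai_resp = completion(model="gpt-3.5-turbo", messages=messages)

# ...and for a locally-running Ollama model.
local_resp = completion(
    model="ollama/llama3",
    messages=messages,
    api_base="http://localhost:11434",
)
print(local_resp.choices[0].message.content)
```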

qdrddr avatar Jun 06 '24 17:06 qdrddr

All, Ollama has been integrated in this branch, and you can also use any OpenAI API-compatible LLM: chore/ai-service/update-env. We'll merge this branch into the main branch in the near future and update the documentation. For now, I'll delete the original Ollama branch. Thank you all for your patience.

related pr: https://github.com/Canner/WrenAI/pull/376

cyyeh avatar Jun 11 '24 05:06 cyyeh

All, we now support using Ollama and OpenAI API-compatible LLMs with the latest release: https://github.com/Canner/WrenAI/releases/tag/0.6.0

Setup instructions for running Wren AI with your custom LLM: https://docs.getwren.ai/installation/custom_llm#running-wren-ai-with-your-custom-llm-or-document-store

Currently, there is one obvious limitation for custom LLMs: you need to use the same provider (such as OpenAI or Ollama) for the LLM and the embedding model. We'll fix that and release a new version soon. Stay tuned 🙂

I'll close this issue as completed now.

cyyeh avatar Jun 28 '24 23:06 cyyeh