Swapping OpenAI with local LLM?

Open kfeeeeee opened this issue 1 year ago • 7 comments

Hi,

Basically the title. The intro suggests that OpenAI access can be replaced with locally running models (maybe with oobabooga's OpenAI API extension?). Anyway, I can't seem to find instructions / env settings for it. Could you tell me if it has been implemented already?

kfeeeeee avatar Jun 18 '23 11:06 kfeeeeee

Not yet as far as I can see...

Nasnl avatar Jun 19 '23 11:06 Nasnl

Not currently. The issue with LocalLLM or other local LLM programs is that you need an API-accessible endpoint, much like what GPT4All provides. There are some API wrappers for LocalLLM that people have built which would work in this instance.

More work needs to be done around this - open to PRs. Embeddings are another issue altogether.

timothycarambat avatar Jun 19 '23 19:06 timothycarambat

Okay, thank you. Maybe a good starting point would be oobabooga's text-generation-webui, since it's capable of mimicking an OpenAI API on port 5001.
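
Something along these lines might work once that extension is enabled (an untested sketch; it assumes the extension serves an OpenAI-style /v1 API on port 5001 and uses the pre-1.0 openai Python client):

# Untested sketch: point the pre-1.0 openai Python client at text-generation-webui's
# OpenAI-compatible extension instead of api.openai.com. Assumes the extension is
# enabled and listening on port 5001 with the usual /v1 path.
import openai

openai.api_key = "sk-dummy"                   # the local server ignores the key
openai.api_base = "http://localhost:5001/v1"  # local endpoint instead of OpenAI

response = openai.ChatCompletion.create(
    model="local-model",  # model name is typically ignored by the local backend
    messages=[{"role": "user", "content": "Hello from a local LLM!"}],
)
print(response["choices"][0]["message"]["content"])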

kfeeeeee avatar Jun 19 '23 19:06 kfeeeeee

Gradio usually comes with a working API

simonSlamka avatar Jun 20 '23 07:06 simonSlamka

I could suggest chromadb for local embeddings - it's already set up, and I've gotten it to work with two Docker instances (localhost changes to host.docker.internal). Getting a wrapper to work with LocalLLM is something I haven't tried yet, though. If that works, this could become a fully autonomous solution for document search and chat, though possibly a slow one.
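
For reference, here's a rough (untested) sketch of what talking to a Chroma server from another container looks like - the collection name is just illustrative, and the exact client call depends on the chromadb version:

# Untested sketch: connecting to a Chroma server running in a separate Docker
# container. From inside another container, "localhost" becomes "host.docker.internal".
# Newer chromadb releases expose HttpClient; older ones use
# chromadb.Client(Settings(chroma_api_impl="rest", ...)) instead.
import chromadb

client = chromadb.HttpClient(host="host.docker.internal", port=8000)

collection = client.get_or_create_collection("anythingllm-docs")  # illustrative name
collection.add(
    ids=["doc-1"],
    documents=["A test chunk to embed and index locally."],
)
print(collection.query(query_texts=["test chunk"], n_results=1))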

AntonioCiolino avatar Jun 20 '23 16:06 AntonioCiolino

I got close, but the chromadb code currently has code in it that force-calls OpenAI.

AntonioCiolino avatar Jun 26 '23 11:06 AntonioCiolino

I should be more clear: I was using LocalAI to pass the OpenAI API calls through to a local LLM. After commenting out the /moderations endpoint (which LocalAI doesn't handle), I was eventually able to call LocalAI and get it to return. However, I discovered that the embeddings - which I also overrode to use BERT - don't work in AnythingLLM, since it expects embeddings to come from OpenAI only. In the process I've managed to mess some of the indexing up; new files aren't getting properly found. I'm probably going to have to wipe this all clean to purge out the BERTs :) LocalAI can call OpenAI and get the OpenAI embeddings, so it's not a complete failure; I was also able to get GPT4All to connect and respond to questions.
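
For context, this is roughly the kind of call that was being routed through LocalAI (an untested sketch; the port and model name are placeholders for whatever the LocalAI instance is actually configured with):

# Untested sketch: the kind of embeddings call that gets routed through LocalAI
# once openai.api_base points at it. Port 8080 is LocalAI's default; the model
# name is a placeholder for whatever maps to the local BERT backend.
import openai

openai.api_key = "sk-dummy"                   # LocalAI ignores the key
openai.api_base = "http://localhost:8080/v1"

result = openai.Embedding.create(
    model="bert-embeddings",                  # hypothetical local model name
    input=["a chunk of a document to embed"],
)
print(len(result["data"][0]["embedding"]))    # dimension depends on the model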

To summarize: it does work, with some tweaking, but unless this project supports other types of embeddings (which I don't think is its calling), a fully offline local LLM setup isn't likely.

Too bad; I really want to do the stuff completely offline.

AntonioCiolino avatar Jun 26 '23 13:06 AntonioCiolino

Hi @AntonioCiolino @simonSlamka @kfeeeeee @timothycarambat, I'm the maintainer of LiteLLM - we let you create a proxy server to call 100+ LLMs, and I think it can solve your problem (I'd love your feedback if it does not).

Try it here: https://docs.litellm.ai/docs/proxy_server

Using LiteLLM Proxy Server

import openai
openai.api_base = "http://0.0.0.0:8000/" # proxy url
print(openai.ChatCompletion.create(model="test", messages=[{"role":"user", "content":"Hey!"}]))

Creating a proxy server

Ollama models

$ litellm --model ollama/llama2 --api_base http://localhost:11434

Hugging Face Models

$ export HUGGINGFACE_API_KEY=my-api-key # [OPTIONAL]
$ litellm --model huggingface/<model-repo-id>

Anthropic

$ export ANTHROPIC_API_KEY=my-api-key
$ litellm --model claude-instant-1

PaLM

$ export PALM_API_KEY=my-palm-key
$ litellm --model palm/chat-bison

ishaan-jaff avatar Sep 29 '23 03:09 ishaan-jaff

@ishaan-jaff have you managed to get anything-llm running using the proxy workaround you suggested? If so, can you describe the steps?

danielnbalasoiu avatar Oct 12 '23 23:10 danielnbalasoiu

@ishaan-jaff Would really like to see NodeJS support for all the models the python client supports

timothycarambat avatar Oct 28 '23 06:10 timothycarambat

PR #335

franzbischoff avatar Nov 06 '23 18:11 franzbischoff

LMStudio integration is now live: f499f1ba59f2e9f8be5e44c89a951e859382e005

timothycarambat avatar Nov 09 '23 20:11 timothycarambat

Moving conversation to #118

timothycarambat avatar Nov 09 '23 20:11 timothycarambat