Switch OpenAI API to other open-source LLMs

Open spinning27 opened this issue 1 year ago • 9 comments

I wonder if it is possible to switch the OpenAI API to other open-source LLMs.

Thanks

spinning27 • May 09 '23 12:05

yea, that's totally possible.

specifically which ones did you have in mind? there's a new one every 3 days ;)

paulpierre • May 09 '23 19:05

@paulpierre , thanks for your prompt reply.

How about a quantized LLaMA 7B? It has been around for a while now. :)

spinning27 • May 09 '23 20:05

of course and thanks @spinning27

what is your host operating system? i can look into creating a LLaMA branch over the weekend because I'm actually curious about the implementation.

some questions before exploring:

  1. base LLaMA isn't fine-tuned for QA/chat AFAIR. would you be open to other, better-suited options?
  2. would it make more sense to run remote inference on HF vs. locally? i think most devs stand to benefit from that convenience (see the sketch below).
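
to make option 2 concrete, here's a minimal sketch of remote inference via the Hugging Face Hub through LangChain. the repo_id and parameters are placeholders (not a decision for the project), and it assumes a `HUGGINGFACEHUB_API_TOKEN` is set:

```python
# Sketch: remote inference on the Hugging Face Hub instead of OpenAI, via LangChain.
# Assumes `huggingface_hub` is installed and HUGGINGFACEHUB_API_TOKEN is set;
# the repo_id below is only an example of a hosted, instruction-tuned model.
from langchain.llms import HuggingFaceHub

llm = HuggingFaceHub(
    repo_id="google/flan-t5-xl",                           # placeholder model
    model_kwargs={"temperature": 0.2, "max_length": 256},
)
print(llm("What channels does RasaGPT support?"))
```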

let me know your thoughts 👍

paulpierre • May 10 '23 13:05

Actually, any of the ones @paulpierre mentioned would do, as long as it is free (the cost of OpenAI API access can go up quite quickly with lots of queries).

ATM I mainly play with it out of curiosity. I have access to a GPU server machine, and I can also run locally on my M1 machine.

StableVicuna-13B seems memory hungry; I have not managed to run it on the machine yet, except for the quantized (compressed) version via llama.cpp.
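
For reference, a minimal sketch of local inference with a quantized GGML file via llama-cpp-python (the Python bindings for llama.cpp); the model path and prompt template are assumptions for a StableVicuna-style model:

```python
# Sketch: local inference on an M1 with a quantized GGML model via llama-cpp-python.
# The model path is a placeholder for whatever quantized file you produced.
from llama_cpp import Llama

llm = Llama(model_path="models/stable-vicuna-13b.ggmlv3.q4_0.bin", n_ctx=2048)
out = llm(
    "### Human: What is RasaGPT?\n### Assistant:",  # Vicuna-style prompt format
    max_tokens=128,
    stop=["### Human:"],
)
print(out["choices"][0]["text"])
```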

spinning27 • May 10 '23 16:05

@paulpierre First of all, great concept! This is going to get a lot of traction for sure.

About an open-source LLM, wouldn't it be best to use GPT4All? It can be used commercially as well, since there is a variant based on GPT-J. Also, it is supported by LangChain, which means that retrieval-augmented QA will also be relatively easy.
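
For what it's worth, a minimal sketch of that combination, GPT4All through LangChain with retrieval-augmented QA; the model file, document path, and embedding model are placeholders, not project decisions:

```python
# Sketch: retrieval-augmented QA with GPT4All through LangChain.
# Requires the gpt4all, sentence-transformers, and faiss-cpu packages;
# the model file and document path below are placeholders.
from langchain.llms import GPT4All
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import FAISS
from langchain.chains import RetrievalQA
from langchain.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter

# Load and chunk a document, then index it in a local vector store
docs = TextLoader("docs/faq.txt").load()
chunks = CharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_documents(docs)
store = FAISS.from_documents(chunks, HuggingFaceEmbeddings())

# GPT-J-based variant, usable commercially
llm = GPT4All(model="models/ggml-gpt4all-j-v1.3-groovy.bin")
qa = RetrievalQA.from_chain_type(llm=llm, retriever=store.as_retriever())
print(qa.run("How do I add a new channel?"))
```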

isu-shrestha • May 10 '23 20:05

LangChain currently develops at breakneck speed, so sometimes things simply do not work, like this one.

spinning27 • May 11 '23 17:05

@spinning27 Fair. However, an argument can be made for picking a version that works and sticking to it. In any case, I think the ability to use open-source LLMs would definitely interest people and organizations that want to decentralize and protect their data from third-party APIs.

isu-shrestha • May 12 '23 01:05

The best way to beat OpenAI's pricing is to use your own deployed LLMs via FastChat or text-generation-webui; both expose a nice OpenAI-compatible API.
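
For illustration, a minimal sketch of reusing the stock openai client against a self-hosted FastChat server; the port and model name are assumptions based on FastChat's defaults:

```python
# Sketch: point existing OpenAI client code at a self-hosted FastChat server.
# Assumes `python3 -m fastchat.serve.openai_api_server` is listening on localhost:8000.
import openai

openai.api_key = "EMPTY"                      # FastChat ignores the key
openai.api_base = "http://localhost:8000/v1"  # point the client at your server

resp = openai.ChatCompletion.create(
    model="vicuna-7b-v1.3",                   # whichever model you served
    messages=[{"role": "user", "content": "Hello from a local LLM!"}],
)
print(resp.choices[0].message.content)
```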

However, an M1 laptop can easily run an LLM locally, and this can be used for any prototyping.

Sample code:

```python
# LangChain's llama.cpp wrappers; the path points at a quantized GGML model file
from langchain.llms import LlamaCpp
from langchain.embeddings import LlamaCppEmbeddings

llm = LlamaCpp(model_path="models/llama-7b.ggmlv3.q4_0.bin", n_ctx=2048, verbose=True)
embeddings_model = LlamaCppEmbeddings(model_path="models/llama-7b.ggmlv3.q4_0.bin")
```

vchauhan1 • Jul 27 '23 20:07

Been looking to build this myself, and there we go: a finely crafted project already exists.

I've made a replacement for the ada embeddings, @paulpierre; perhaps hosting it locally would cut running costs: https://github.com/proitservices/elmo_embedding_api

Also, GPT4All is a great choice, and with a DB + LangChain the Vicuna-13B model would do just amazing (Mistral 7B is also a good choice).

Would love to see this project grow into a fully featured and capable 'Jarvis' with memory and math capabilities, plus a rain of API extensions out of the box. Happy to help with those.

Peter
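
For illustration, a minimal sketch of the local-embeddings idea (this is not the elmo_embedding_api interface, just LangChain's sentence-transformers wrapper; the model name is an assumption):

```python
# Sketch: replace text-embedding-ada-002 with a locally hosted embedding model.
# The model name is an assumption; any sentence-transformers model works here.
from langchain.embeddings import HuggingFaceEmbeddings

embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
vec = embeddings.embed_query("RasaGPT routes user questions to an LLM")
print(len(vec))  # 384 dimensions, vs. 1536 for text-embedding-ada-002
```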

proitservices • Nov 24 '23 09:11