
Integrating local LLMs

Open ogzhanolguncu opened this issue 1 year ago • 0 comments

Currently, the RAG SDK only supports hosted models. It would be great to enable local models such as web-llm. The catch is that although these models are OpenAI-compatible, they don't expose an HTTP endpoint to query. I believe we could write a simple HTTP server using Bun and start it only when the user opts into one of the local LLMs.
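To make "OpenAI-compatible but no endpoint" concrete: an in-process engine like web-llm consumes the same chat-completion payload a hosted model does, only without a URL to send it to. A rough sketch (the `chatRequest` helper and names are illustrative, not part of the SDK):

```typescript
// The OpenAI-style chat payload both a hosted endpoint and an in-process
// engine consume; only the transport differs.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

function chatRequest(model: string, messages: ChatMessage[]) {
  return { model, messages, stream: false };
}

// Hosted path: POST this payload to the provider's HTTP endpoint.
// In-process path (web-llm): hand the same payload to the engine's
// OpenAI-style API directly — there is no URL to give the SDK,
// hence the proposed local server.
```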

For example:

  1. Start the model.
  2. Hook it up to a simple server.
  3. Generate a predefined URL, pass it to our base LLM client, and register it like any other model.

The rest should be straightforward.
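The steps above could be sketched roughly as follows. This assumes an Ollama instance already listening on its default port 11434 with an OpenAI-compatible `/v1` route; the `LocalModelConfig` type, port, and function names are illustrative, not part of the SDK:

```typescript
// Sketch: put a local model behind a predictable URL for the base LLM client.
type LocalModelConfig = {
  name: string; // e.g. "llama3"
  port: number; // port our thin local server listens on
};

// Step 3: the predefined URL handed to the base LLM client.
function localBaseUrl(cfg: LocalModelConfig): string {
  return `http://localhost:${cfg.port}/v1`;
}

// Step 2: a thin proxy forwarding OpenAI-style requests to the local model.
// With Bun it could be started as:
//   Bun.serve({ port: cfg.port, fetch: forward });
async function forward(req: Request): Promise<Response> {
  const url = new URL(req.url);
  // Assumes Ollama's OpenAI-compatible server; streaming request bodies
  // may additionally need `duplex: "half"` on some runtimes.
  return fetch(`http://localhost:11434${url.pathname}${url.search}`, {
    method: req.method,
    headers: req.headers,
    body: req.body,
  });
}
```

With something like this in place, registering a local model is just pointing the existing client at `localBaseUrl(cfg)` instead of a hosted provider's URL.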

  • [ ] https://github.com/mlc-ai/web-llm
  • [x] https://ollama.com/

ogzhanolguncu · Aug 13 '24 08:08