
Offline HuggingFace Models

jtkeyva opened this issue 1 year ago · 6 comments

Feature request

It would be nice to support Hugging Face models, specifically distilbert-base-uncased-finetuned-sst-2-english.

Motivation

To integrate OpenAI, Google and Hugging Face models

Your contribution

I can submit ideas and concepts

jtkeyva avatar Sep 19 '23 06:09 jtkeyva

Hey @jtkeyva,

Thanks for opening the issue.

Can you give some more detail on how you envision the integration? Do you know how we can communicate with those models locally? (e.g. ffi, client-server)

davidmigloz avatar Sep 21 '23 18:09 davidmigloz

Well, simply using Hugging Face as a "model source" alongside OpenAI and MLKit, with the ability to store and use the models offline. That way it could be a generic LLM API where people could make their own "recipes", kind of like IFTTT or Zapier, connecting different models whether offline or through an online API.

Offline:
https://pub.dev/packages/whisper_flutter_plus
https://pub.dev/packages/fl_mlkit_translate_text
https://pub.dev/packages/google_ml_kit
https://saturncloud.io/blog/how-to-download-hugging-face-sentimentanalysis-pipeline-for-offline-use/
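
For context, the offline-download approach from that last link amounts to something like this (a minimal Python sketch using the transformers library; the local directory path is illustrative):

from transformers import pipeline

# Download the model and tokenizer once while online
classifier = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")

# Save everything to a local directory (illustrative path)
classifier.save_pretrained("./local-sst2-model")

# Later, fully offline: load the pipeline from disk instead of the Hub
offline = pipeline("sentiment-analysis", model="./local-sst2-model")
print(offline("This works without a network connection."))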

Oh I just saw this: https://pub.dev/packages/langchain_huggingface

It's all pipes ha

jtkeyva avatar Sep 21 '23 19:09 jtkeyva

At the moment, we have an open issue about integrating with HuggingFaceHub. So, for example, you could use distilbert-base-uncased-finetuned-sst-2-english via their inference endpoint. Will that be sufficient for your use case?

*The langchain_huggingface package is currently a placeholder for the #19 implementation.
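
For reference, hitting that model through Hugging Face's hosted Inference API is a single HTTP call (a minimal Python sketch; the token is a placeholder):

import requests

# Hosted inference endpoint for the requested model
API_URL = "https://api-inference.huggingface.co/models/distilbert-base-uncased-finetuned-sst-2-english"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}  # placeholder token

# Text classification: send the input text, get back label/score pairs
response = requests.post(API_URL, headers=headers, json={"inputs": "I love this library!"})
print(response.json())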

davidmigloz avatar Sep 21 '23 20:09 davidmigloz

Yeah, that's a great start. I was thinking offline capabilities would take things to the next level.

jtkeyva avatar Sep 22 '23 05:09 jtkeyva

Hi, we'd be interested in this. How does one start?

nundys avatar Sep 26 '23 12:09 nundys

Hi @davidmigloz, I'm the maintainer of LiteLLM (an abstraction layer for calling 100+ LLMs). We let you create a proxy server to call 100+ LLMs, and I think it can solve your problem (I'd love your feedback if it does not).

Try it here:
https://docs.litellm.ai/docs/proxy_server
https://github.com/BerriAI/litellm

Using LiteLLM Proxy Server

import openai

# Point the OpenAI client at the local LiteLLM proxy instead of api.openai.com
openai.api_base = "http://0.0.0.0:8000/"  # proxy url

# The proxy translates this OpenAI-style call to whichever backend model it was started with
print(openai.ChatCompletion.create(model="test", messages=[{"role": "user", "content": "Hey!"}]))
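
Note that this snippet targets the pre-1.0 openai Python SDK, where openai.api_base and openai.ChatCompletion are module-level; on openai>=1.0 the equivalent is client = openai.OpenAI(base_url=...) followed by client.chat.completions.create(...).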

Creating a proxy server

Ollama models

$ litellm --model ollama/llama2 --api_base http://localhost:11434

Hugging Face Models

$ export HUGGINGFACE_API_KEY=my-api-key #[OPTIONAL]
$ litellm --model huggingface/bigcode/starcoder  # any huggingface/<org>/<model> id works

Anthropic

$ export ANTHROPIC_API_KEY=my-api-key
$ litellm --model claude-instant-1

PaLM

$ export PALM_API_KEY=my-palm-key
$ litellm --model palm/chat-bison
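
Once any of these proxies is running, it can be exercised without an SDK at all, since the proxy exposes OpenAI-compatible routes (a sketch; assumes the default host and port from the snippet above):

$ curl http://0.0.0.0:8000/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "test", "messages": [{"role": "user", "content": "Hey!"}]}'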

ishaan-jaff avatar Sep 29 '23 04:09 ishaan-jaff