
[Issue]: How to use (free) Hugging Face inference in AutoGen Studio by setting up .env?

Open vanetreg opened this issue 1 year ago • 7 comments

Describe the issue

Playing around with AutoGen Studio is best done for free, but since I don't have decent hardware to run LLMs locally, I want to use the free Hugging Face inference API. I selected Zephyr-7B, which has free inference at https://api-inference.huggingface.co/models/HuggingFaceH4/zephyr-7b-beta. I added it as the LLM for an agent, but it doesn't work: it complains about missing API keys, even though my OPENAI_API_KEY and HUGGINGFACEHUB_API_TOKEN (which is required to use HF APIs) are set in .env in the project root.

The latest error is:

    raise OpenAIError(
    openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable

I was able to use this free HF API in my tests, so the issue is AutoGen Studio related. So how do I set up the HF API key in AutoGen Studio (in .env)? My other option is to use paid HF, Replicate, Mistral, etc. inference, where I would also have to add my API key, so where and how should I do that?
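For what it's worth, here is a minimal sketch of how I make the .env values visible, assuming python-dotenv (AutoGen Studio does not document reading a project-level .env on its own, so this is my own workaround, not official behavior):

    # Minimal sketch, assuming python-dotenv is installed (pip install python-dotenv).
    # AutoGen / AutoGen Studio do not document reading a project-level .env themselves,
    # so loading it explicitly makes the variables visible to the process.
    import os
    from dotenv import load_dotenv

    load_dotenv()  # reads .env from the current working directory

    # Verify both keys are now visible to whatever runs in this process:
    print("OPENAI_API_KEY set:", os.getenv("OPENAI_API_KEY") is not None)
    print("HUGGINGFACEHUB_API_TOKEN set:", os.getenv("HUGGINGFACEHUB_API_TOKEN") is not None)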

Steps to reproduce

No response

Screenshots and logs

No response

Additional Information

AutoGen: 0.2.7
AutoGen Studio: 0.0.28a0
Python: 3.11.6
OS: Windows 10

vanetreg avatar Jan 18 '24 18:01 vanetreg

Hi @vanetreg ,

Thanks for raising this. Do you know if the HF Inference API is an OpenAI-compatible endpoint? AutoGen (and AutoGen Studio) standardize on LLM endpoints that are OpenAI-compliant (they take a base URL and an API key). I am not sure a .env file will necessarily enable HF endpoint support. Happy to hear your thoughts.
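For reference, a minimal sketch of the shape AutoGen expects per model entry (placeholder values only; the raw HF inference URL itself is not OpenAI-compatible, which is exactly the question here):

    # Minimal sketch of the OpenAI-compatible config AutoGen standardizes on.
    # The model name, URL, and env var below are placeholders, not a working HF setup.
    import os
    import autogen

    config_list = [
        {
            "model": "my-model",                     # placeholder model name
            "base_url": "http://localhost:8000/v1",  # any OpenAI-compatible server
            "api_key": os.getenv("MY_API_KEY", "placeholder"),  # hypothetical env var
        }
    ]

    assistant = autogen.AssistantAgent(
        "assistant", llm_config={"config_list": config_list}
    )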

victordibia avatar Jan 19 '24 02:01 victordibia

> ... Do you know if the HF Inference API is an OpenAI-compatible endpoint? AutoGen (and AutoGen Studio) standardize on LLM endpoints that are OpenAI-compliant (they take a base URL and an API key). I am not sure a .env file will necessarily enable HF endpoint support. Happy to hear your thoughts.

Hi @victordibia, thank you for your response! :) I think it is not compatible; even Hugging Face Chat (with NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO) says so (https://huggingface.co/chat). When I was able to use HF or other LLM inference, the tested code was based on either LangChain or LiteLLM, in both cases with clear instructions for setting up the specific env vars.
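For context, the raw HF Inference API call those wrappers make looks roughly like this (a sketch; note the request/response schema is Hugging Face's own, not OpenAI's, which is presumably why AutoGen rejects it):

    # Sketch of a direct call to the free HF Inference API.
    # The payload/response schema is Hugging Face's, not OpenAI's.
    import os
    import requests

    API_URL = "https://api-inference.huggingface.co/models/HuggingFaceH4/zephyr-7b-beta"
    headers = {"Authorization": f"Bearer {os.environ['HUGGINGFACEHUB_API_TOKEN']}"}

    response = requests.post(API_URL, headers=headers, json={"inputs": "Hello, who are you?"})
    print(response.json())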

I watched 5+ videos on YT about AutoGen Studio, but all of them only show exporting OPENAI_API_KEY in the terminal (macOS), while I'm on Windows and prefer to see all related env vars easily in one place, like in a .env file. I didn't find any OAI_CONFIG_LIST_sample or .env sample to modify for Studio...

So what should we enter in the "API key" field on the Studio frontend when we set up LLMs and try to use HF, Runpod, Replicate, Gradio, or local LLMs?

I'd love to test Autogen / Studio with some Microsoft models on HF inference like Phi-2 or Orca-2... :)

If MS wants to help devs and citizen devs like me use alternative APIs (I have doubts :-/) with AutoGen and Studio, please add clear instructions / examples to the docs / code samples. Learning and playing around with such cognitive architectures can be very expensive, even when using caching. Thank you!

vanetreg avatar Jan 19 '24 08:01 vanetreg

Hi @vanetreg, I think you can use the LiteLLM proxy to call any external model API. LiteLLM provides an OpenAI-compatible API endpoint. You can look at this video for more details: https://youtu.be/-Wo025I-_I4?t=238
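Something like this, as a rough sketch (the proxy port and exact flags depend on your LiteLLM version, so treat these values as assumptions, not a verified setup):

    # Rough sketch: point AutoGen at a local LiteLLM proxy.
    # Start the proxy first (shell), e.g.:
    #   export HUGGINGFACE_API_KEY=hf_...   # LiteLLM reads the HF key from env
    #   litellm --model huggingface/HuggingFaceH4/zephyr-7b-beta
    # The default proxy port has varied across LiteLLM versions; adjust as needed.
    import autogen

    config_list = [
        {
            "model": "huggingface/HuggingFaceH4/zephyr-7b-beta",
            "base_url": "http://localhost:8000",  # the local LiteLLM proxy
            "api_key": "not-needed",              # the proxy holds the real key
        }
    ]

    assistant = autogen.AssistantAgent(
        "assistant", llm_config={"config_list": config_list}
    )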

gauravdhiman avatar Jan 28 '24 21:01 gauravdhiman

Has anyone come up with a way to connect HuggingChat models to AutoGen? That would be something.

bpawnzZ avatar Apr 11 '24 01:04 bpawnzZ

https://huggingface.co/spaces/Tonic/EasyYI/blob/main/app.py — you can make a connector like this, based on the endpoint you're using / serving, which basically replicates an OpenAI API (easy), then use it accordingly.
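As a rough sketch of what such a connector could look like (my own minimal FastAPI illustration, not the linked Space's actual code; streaming, error handling, and proper prompt templating are omitted):

    # Minimal sketch of an OpenAI-style connector in front of the HF Inference API.
    # This is an illustration, not the linked Space's implementation.
    import os
    import time

    import requests
    from fastapi import FastAPI

    app = FastAPI()
    HF_URL = "https://api-inference.huggingface.co/models/HuggingFaceH4/zephyr-7b-beta"
    HEADERS = {"Authorization": f"Bearer {os.environ['HUGGINGFACEHUB_API_TOKEN']}"}

    @app.post("/v1/chat/completions")
    def chat_completions(body: dict):
        prompt = body["messages"][-1]["content"]  # naive: forward the last message only
        hf = requests.post(HF_URL, headers=HEADERS, json={"inputs": prompt}).json()
        text = hf[0]["generated_text"] if isinstance(hf, list) else str(hf)
        return {
            "id": "chatcmpl-local",
            "object": "chat.completion",
            "created": int(time.time()),
            "model": body.get("model", "zephyr-7b-beta"),
            "choices": [{
                "index": 0,
                "message": {"role": "assistant", "content": text},
                "finish_reason": "stop",
            }],
            "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0},
        }

    # Run with, e.g.: uvicorn connector:app --port 8001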

Josephrp avatar Apr 11 '24 12:04 Josephrp

> you can make a connector like this, based on the endpoint you're using / serving, which basically replicates an OpenAI API (easy), then use it accordingly

Can anyone confirm this works? I am also very interested in getting this working.

bpawnzZ avatar Apr 14 '24 19:04 bpawnzZ

This looks like very light scaffolding, but perhaps this HF model layer could help: https://github.com/Solonce/HFAutogen

There are a lot of placeholders in the base class pointing to Mixtral, so it may require a few more patches to be totally model agnostic. Overall, breaking the lock to OAI-only interfaces would help grow the LLM, image, and other capabilities available to AutoGen.

ezavesky avatar May 08 '24 12:05 ezavesky

Workarounds have been provided and other examples are available.

rysweet avatar Oct 18 '24 19:10 rysweet