[Issue]: How to use (free) Hugging Face inference in AutoGen Studio by setting up .env?
Describe the issue
Playing around with AutoGen Studio is best done for free, and I don't have decent hardware to run LLMs locally, so I want to use free Hugging Face inference. I selected Zephyr-7B, which has free inference at: https://api-inference.huggingface.co/models/HuggingFaceH4/zephyr-7b-beta I added it as the LLM for an Agent, but it doesn't work; it complains about missing API keys, even though my OPENAI_API_KEY and HUGGINGFACEHUB_API_TOKEN (which is required to use HF APIs) are set in .env in the project root.
The latest error is:
raise OpenAIError(
openai.OpenAIError: The api_key client option must be set either by passing api_key to the client or by setting the OPENAI_API_KEY environment variable
In my tests I was able to use this free HF API, so the issue is AutoGen Studio related. So how do I set up the HF API key in AutoGen Studio (in .env)? My other option is to use paid HF, Replicate, Mistral, etc. inference, where I would also have to add my API key, so where and how should I do that?
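For reference, a standalone test of the free endpoint looks roughly like this (a minimal sketch; it assumes HUGGINGFACEHUB_API_TOKEN is set in the environment, e.g. loaded from .env):

```python
# Minimal sketch: calling the free HF Inference API directly with requests.
# Assumes HUGGINGFACEHUB_API_TOKEN is set (e.g. exported or loaded from .env).
import os
import requests

API_URL = "https://api-inference.huggingface.co/models/HuggingFaceH4/zephyr-7b-beta"
headers = {"Authorization": f"Bearer {os.environ['HUGGINGFACEHUB_API_TOKEN']}"}

resp = requests.post(API_URL, headers=headers, json={"inputs": "Hello, who are you?"})
print(resp.json())  # typically a list like [{"generated_text": "..."}]
```

A call like this succeeds outside of Studio, which is why the error above comes from AutoGen's OpenAI client rather than from the HF API itself.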
Steps to reproduce
No response
Screenshots and logs
No response
Additional Information
AutoGen: 0.2.7, AutoGen Studio: 0.0.28a0, Python: 3.11.6, OS: Windows 10
Hi @vanetreg ,
Thanks for raising this. Do you know if the HF inference API is an OpenAI-compatible endpoint? AutoGen (and AutoGen Studio) standardize on LLM endpoints that are OAI-compliant (they take a base URL and an API key to work). I am not sure a .env file will necessarily enable HF endpoint support. Happy to hear your thoughts.
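For concreteness, "OAI-compliant" means the endpoint works with the standard OpenAI client given just a base URL and an API key. A minimal sketch, where the URL and key are placeholders for an OpenAI-compatible proxy, not a real HF endpoint:

```python
# Sketch of what "OAI-compliant" means: the endpoint must accept the standard
# OpenAI chat-completions protocol. The base_url below is a placeholder for an
# OpenAI-compatible proxy; the raw HF Inference API does not speak this protocol.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="sk-anything")
resp = client.chat.completions.create(
    model="zephyr-7b-beta",
    messages=[{"role": "user", "content": "Hello"}],
)
print(resp.choices[0].message.content)
```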
Hi @victordibia , thank you for your response! :) I think it is not compatible; even Hugging Face Chat (with NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO) says so ( https://huggingface.co/chat ). When I was able to use HF or other LLM inference, the tested code was based on either LangChain or LiteLLM, in both cases with clear instructions for setting up specific env vars.
I watched 5+ videos on YT about AutoGen Studio, but all of them only show exporting OPENAI_API_KEY in the terminal (macOS), while I'm on Windows and prefer to see all related env vars easily in one place, like in a .env file. I didn't find any OAI_CONFIG_LIST_sample or .env sample to modify for Studio...
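For core AutoGen, a config list can be read from an env var or a file named OAI_CONFIG_LIST; a minimal sketch of that pattern, assuming python-dotenv is installed (whether Studio itself honors a .env file is exactly the open question here):

```python
# Sketch of the OAI_CONFIG_LIST pattern in core AutoGen 0.2.x (Studio's own
# .env handling may differ). Assumes python-dotenv is installed.
from dotenv import load_dotenv
import autogen

load_dotenv()  # pulls OPENAI_API_KEY etc. from .env into the process env

# OAI_CONFIG_LIST may be an env var containing JSON, or a JSON file, e.g.:
# [{"model": "gpt-4", "api_key": "<your key>"},
#  {"model": "zephyr-7b-beta", "base_url": "<proxy url>", "api_key": "x"}]
config_list = autogen.config_list_from_json(env_or_file="OAI_CONFIG_LIST")
```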
So what should we enter in the "API key" field on the Studio frontend when setting up LLMs from HF, Runpod, Replicate, Gradio, or running locally?
I'd love to test AutoGen / Studio with some Microsoft models on HF inference, like Phi-2 or Orca-2... :)
If MS wants to help devs and citizen devs like me use alternative APIs (I have my doubts :-/ ) with AutoGen and Studio, please add clear instructions / examples to the docs / code samples. Learning and playing around with such cognitive architectures can be very expensive, even with caching. Thank you!
Hi @vanetreg, I think you can use the LiteLLM proxy to call any external model API. LiteLLM provides an OpenAI-compatible API. You can look at this video for more details: https://youtu.be/-Wo025I-_I4?t=238
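A rough sketch of that route (the model string follows LiteLLM's huggingface/ prefix convention; the port is an assumption, as the proxy's default port has varied across LiteLLM versions, so use whatever it prints on startup):

```python
# Sketch: start the LiteLLM proxy in a terminal first, e.g.
#   export HUGGINGFACE_API_KEY=hf_...   (set HUGGINGFACE_API_KEY=... on Windows)
#   litellm --model huggingface/HuggingFaceH4/zephyr-7b-beta
# then point AutoGen at it as an OpenAI-compatible endpoint.
import autogen

config_list = [{
    "model": "huggingface/HuggingFaceH4/zephyr-7b-beta",
    "base_url": "http://localhost:8000",  # use the port the proxy prints
    "api_key": "anything",  # the local proxy does not require a real key by default
}]
assistant = autogen.AssistantAgent("assistant", llm_config={"config_list": config_list})
```

In the Studio UI, the same base URL and a placeholder key would go into the model's fields.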
Has anyone come up with a way to connect HuggingChat models to AutoGen? That would be something.
https://huggingface.co/spaces/Tonic/EasyYI/blob/main/app.py – you can make a connector like this, based on the endpoint you're using / serving, which basically replicates an OpenAI API (easy), then use it accordingly.
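A minimal sketch of such a connector (all names here are illustrative; this is not the linked Space's exact code): a tiny FastAPI app exposing an OpenAI-style /v1/chat/completions route that forwards to the HF Inference API.

```python
# Run with: uvicorn connector:app --port 8000
# then use http://localhost:8000/v1 as the base_url in AutoGen.
import os
import time

import requests
from fastapi import FastAPI
from pydantic import BaseModel

HF_URL = "https://api-inference.huggingface.co/models/HuggingFaceH4/zephyr-7b-beta"
HF_HEADERS = {"Authorization": f"Bearer {os.environ['HUGGINGFACEHUB_API_TOKEN']}"}

app = FastAPI()

class ChatRequest(BaseModel):
    model: str
    messages: list  # [{"role": ..., "content": ...}, ...]

@app.post("/v1/chat/completions")
def chat_completions(req: ChatRequest):
    # Naive prompt assembly; a real connector would apply the model's chat template.
    prompt = "\n".join(f"{m['role']}: {m['content']}" for m in req.messages)
    hf_resp = requests.post(HF_URL, headers=HF_HEADERS, json={"inputs": prompt})
    text = hf_resp.json()[0]["generated_text"]
    # Shape the reply like an OpenAI chat-completions response.
    return {
        "id": "chatcmpl-0",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": req.model,
        "choices": [{
            "index": 0,
            "message": {"role": "assistant", "content": text},
            "finish_reason": "stop",
        }],
    }
```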
Can anyone confirm this works? I am also very interested in getting this working.
This looks like very light scaffolding, but perhaps this HF model layer could help: https://github.com/Solonce/HFAutogen
There are a lot of placeholders in the base class pointing to Mixtral, so it may require a few more patches to be totally model agnostic. Overall, breaking that lock-in to OAI-only interfaces would help grow AutoGen's LLM, image, and other capabilities.
workarounds provided and other examples available