
Create constants and add docs to Model Clients for external LLMs through OpenAI API

Open vballoli opened this issue 1 year ago • 6 comments

Now that multiple external LLMs are supported through the OpenAI API:

  1. Allow `_model_info` to be supplied as a class variable on `OpenAIChatCompletionClient` so that external models served through the OpenAI API are supported, and ship standard `model_info` classes for popular LLMs (see https://github.com/microsoft/autogen/blob/f40b0c27307696a996817f9cf52bd16ac8eae291/python/packages/autogen-core/src/autogen_core/components/models/_openai_client.py#L336 — the `_model_info` there could also be assigned externally, as shown in the example below).
  2. Document these external-model capabilities in the docs here: https://microsoft.github.io/autogen/dev/user-guide/core-user-guide/framework/model-clients.html

I can send a PR if this makes sense.

TLDR - to enable code like this:

The example usage stays the same apart from the new `model_info` argument:

import asyncio

from autogen_core.components.models import OpenAIChatCompletionClient, UserMessage


class GeminiModelInfo:
    """Model info for Gemini 1.5 (flash, pro) served via the OpenAI API."""

    @staticmethod
    def get_capabilities(model: str) -> dict: ...  # returns {"vision": ..., ...}

    @staticmethod
    def resolve_model(model: str) -> str: ...

    @staticmethod
    def get_token_limit(model: str) -> int: ...

    @staticmethod
    def get_base_url() -> str: ...


# Create an OpenAI-compatible model client backed by Gemini.
model_client = OpenAIChatCompletionClient(
    api_key="API_KEY",
    model="gemini-1.5-flash",
    model_info=GeminiModelInfo,
)

model_client_result = asyncio.run(model_client.create(
    messages=[
        UserMessage(content="What is the capital of France?", source="user"),
    ]
))
print(model_client_result)  # "Paris"

vballoli avatar Nov 09 '24 19:11 vballoli

@victordibia I can send a PR if this looks okay ?

vballoli avatar Nov 11 '24 16:11 vballoli

Its a good idea! In general, I think we need to have a structured design/approach to supporting any model client. Some of this might already have been discussed in a different issue, let me loop in @ekzhu and @jackgerrits for thoughts here.

victordibia avatar Nov 11 '24 16:11 victordibia

The reason I opened this is that the current pattern felt repetitive, especially when working with multi-model multi-agent systems; a built-in mechanism would improve usability and keep the code clean in those setups.

This is my current code, based on the example, before any changes:

import asyncio

from autogen_core.components.models import OpenAIChatCompletionClient, UserMessage

# Create an OpenAI-compatible model client pointed at Gemini.
model_client = OpenAIChatCompletionClient(
    model="gemini-1.5-flash",
    base_url="https://generativelanguage.googleapis.com/v1beta/",
    api_key="API_KEY",
    model_capabilities={
        "vision": True,
        "function_calling": True,
        "json_output": True,
    },
)

model_client_result = asyncio.run(model_client.create(
    messages=[
        UserMessage(content="What is the capital of France?", source="user"),
    ]
))
print(model_client_result)  # "Paris"

vballoli avatar Nov 11 '24 17:11 vballoli

Looks like Gemini supports OpenAI chat completion client now: https://developers.googleblog.com/en/gemini-is-now-accessible-from-the-openai-library/

I think having a built-in list of model capabilities, to avoid repeated input, is a good idea. We might even want to do this for local models.
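A built-in capability registry might look something like the sketch below. This is purely illustrative, not the actual autogen implementation; the names `_MODEL_CAPABILITIES` and `get_capabilities` are hypothetical, and lookup is by model-name prefix so versioned model names resolve too.

```python
# Hypothetical sketch of a built-in capabilities registry keyed by
# model-name prefix, so callers need not repeat the same dict.
_MODEL_CAPABILITIES = {
    "gemini-1.5-flash": {"vision": True, "function_calling": True, "json_output": True},
    "gemini-1.5-pro": {"vision": True, "function_calling": True, "json_output": True},
}

def get_capabilities(model: str) -> dict:
    """Return built-in capabilities, falling back to a conservative default."""
    for prefix, caps in _MODEL_CAPABILITIES.items():
        if model.startswith(prefix):
            return caps
    # Unknown model: assume the least-capable configuration.
    return {"vision": False, "function_calling": False, "json_output": False}
```

With such a table, `OpenAIChatCompletionClient(model="gemini-1.5-flash", ...)` could pick up capabilities automatically instead of requiring the `model_capabilities` dict on every call site.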

@vballoli are you interested in adding the current usage of Google Gemini to the doc here: python/packages/autogen-core/docs/src/user-guide/agentchat-user-guide/tutorial/models.ipynb

Looping in @jackgerrits for thoughts.

ekzhu avatar Nov 11 '24 21:11 ekzhu

@ekzhu Yes, happy to add this to the docs.

The potential concern is that any functionality relying on token limits would still raise an error, which is why I was hoping `_model_info` could become a class-based interface. I will send a PR for the docs in a few hours.
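To illustrate the token-limit concern: a hard-coded lookup table fails for external models, while a user-supplied class can answer for them. This is a hypothetical sketch; `_TOKEN_LIMITS`, `get_token_limit`, and the `GeminiModelInfo` method shown are illustrative names, not the autogen internals.

```python
# Hypothetical sketch: a built-in token-limit table only knows
# first-party models, so external models fall through with an error.
_TOKEN_LIMITS = {"gpt-4o": 128_000}  # built-in models only

def get_token_limit(model: str) -> int:
    try:
        return _TOKEN_LIMITS[model]
    except KeyError:
        raise KeyError(f"Unknown model {model!r}: no token limit registered")

class GeminiModelInfo:
    """User-supplied model_info, so lookups never fall through to the table."""

    @staticmethod
    def get_token_limit(model: str) -> int:
        return 1_048_576  # Gemini 1.5 Flash context window
```

Here `get_token_limit("gemini-1.5-flash")` raises `KeyError`, whereas a class-based `model_info` lets the caller supply the limit themselves.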

vballoli avatar Nov 11 '24 21:11 vballoli

Hi, any updates on this?

I am getting a `models/gpt-4o is not found` error when using Gemini with Google's OpenAI-compatible base URL in Magentic-One: it sends `gpt-4o` despite the environment being correctly configured with `gemini-1.5-flash`.

nullnuller avatar Nov 18 '24 00:11 nullnuller

@nullnuller you need to set the model capabilities. See the API doc on using OpenAIChatCompletionClient: https://microsoft.github.io/autogen/dev/reference/python/autogen_ext/autogen_ext.models.html#autogen_ext.models.OpenAIChatCompletionClient

ekzhu avatar Nov 18 '24 02:11 ekzhu

Closing this due to #4232

The plan is to create community packages for a Gemini-specific client. See the guide on how to create community packages: https://microsoft.github.io/autogen/dev/user-guide/extensions-user-guide/index.html

ekzhu avatar Nov 18 '24 02:11 ekzhu

I'm currently working on the community packages for external LLMs through the OpenAI client @nullnuller - I'll keep the conversation updated

vballoli avatar Nov 18 '24 02:11 vballoli

> I'm currently working on the community packages for external LLMs through the OpenAI client @nullnuller - I'll keep the conversation updated

@vballoli just wondering if the community package has been completed? Thanks for your efforts.

nullnuller avatar Nov 26 '24 08:11 nullnuller

I'll be maintaining it here: https://github.com/vballoli/autogen-openaiext-client. Apologies for not finishing it sooner; I'll get to it after Thanksgiving, following the timeline proposed in the README. @nullnuller

vballoli avatar Nov 28 '24 01:11 vballoli

I've made updates and tested with Magentic-One. There seem to be issues with tool calling in Gemini, but apart from that everything appears to work fine. I will follow up with better tests and make it easily installable through PyPI in the next few days. @nullnuller

vballoli avatar Nov 29 '24 11:11 vballoli