[Roadmap]: Gemini Integration

Open BeibinLi opened this issue 1 year ago • 8 comments

[!TIP]

Want to get involved?

We'd love it if you did! Please get in contact with the people assigned to this issue, or leave a comment. See general contributing advice here too.

Intro

Gemini is Google's GenAI product, giving users another family of GenAI models. Since December 2023, our "gemini" branch has seen significant community interest, including the integration of Gemma with AutoGen earlier this year. We're now excited to bring the experimental "gemini" branch into AutoGen's main release. AutoGen currently supports Gemini chat completion and vision models; this roadmap lays out TODOs, feature requests, and future directions for unlocking more of Gemini's potential within AutoGen.

The Roadmap

Merged PR: https://github.com/microsoft/autogen/pull/2360

Future:

  • GenerationConfig Parameters: temperature, max_tokens, etc.
  • Multiple Responses: generate several responses at the same time.
  • More Accurate Cost/Token Calculation. Great feedback from @joshkyh
  • Function Calling.
  • JSON Output: structure outputs in JSON.
  • Image Generation.
  • Other Modalities: audio/video generation and processing.
  • Safety Settings: user-customizable safety controls. Feedback from @marklysze: chat.send_message("hi", stream=stream, safety_settings=safety) (documentation).
  • Multi-turn Vision Chat: improve conversational context for image tasks.
  • Streaming: real-time output from the client.
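As a rough illustration of the GenerationConfig, Multiple Responses, and Safety Settings items above, the sketch below translates OpenAI-style parameters into the key names used by Gemini's generation config (`max_output_tokens`, `candidate_count`, `stop_sequences`) and builds a safety-settings list. The helper names `to_gemini_generation_config` and `build_safety_settings` are hypothetical and are not part of AutoGen or the Gemini SDK. The mapping is an assumption about how such a translation could look, not the merged implementation.

```python
from typing import Any, Dict, List

# Assumed mapping from OpenAI-style keys to Gemini generation-config keys.
_OPENAI_TO_GEMINI = {
    "temperature": "temperature",
    "top_p": "top_p",
    "max_tokens": "max_output_tokens",
    "n": "candidate_count",  # "Multiple Responses" in the roadmap
    "stop": "stop_sequences",
}


def to_gemini_generation_config(params: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: keep only sampling parameters, renamed for Gemini."""
    config: Dict[str, Any] = {}
    for key, value in params.items():
        target = _OPENAI_TO_GEMINI.get(key)
        if target is None or value is None:
            continue  # drop unrelated keys (e.g. "model") and unset values
        # OpenAI accepts a single stop string; Gemini expects a list.
        if target == "stop_sequences" and isinstance(value, str):
            value = [value]
        config[target] = value
    return config


def build_safety_settings(thresholds: Dict[str, str]) -> List[Dict[str, str]]:
    """Hypothetical helper: one {"category", "threshold"} entry per category."""
    return [{"category": c, "threshold": t} for c, t in thresholds.items()]


# Example values are illustrative only.
llm_params = {"model": "gemini-pro", "temperature": 0.2, "max_tokens": 256, "n": 1, "stop": "END"}
generation_config = to_gemini_generation_config(llm_params)
safety = build_safety_settings({"HARM_CATEGORY_HARASSMENT": "BLOCK_ONLY_HIGH"})
```

The resulting `generation_config` and `safety` values could then be passed along with a call such as the `chat.send_message(..., safety_settings=safety)` example quoted in the list, assuming the client accepts them in that shape.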

BeibinLi avatar Apr 15 '24 03:04 BeibinLi

Hi! Is this the list of features planned for the future, or of what has already been done? I don't see safety settings or generation config in this PR, and I'd like to be an active contributor to the repo because I'm going to be using it very actively. Could I help with it?

  • GenerationConfig Parameters: temperature, max_tokens, etc.
  • Multiple Responses: generate several responses at the same time.
  • Cost/Token Calculation. Great feedback from @joshkyh, documentation
  • Function Calling.
  • JSON Output: structure outputs in JSON.
  • Image Generation.
  • Other Modalities: audio/video generation and processing.
  • Safety Settings: user-customizable safety controls. Feedback from @marklysze: chat.send_message("hi", stream=stream, safety_settings=safety) (documentation).
  • Multi-turn Vision Chat: improve conversational context for image tasks.
  • Streaming: real-time output from the client.

NikolayTV avatar Apr 17 '24 10:04 NikolayTV

@NikolayTV the list is of tasks not yet completed. If you are interested in contributing, we can discuss here and create an issue for you. cc @BeibinLi @sonichi

ekzhu avatar Apr 17 '24 16:04 ekzhu

Thanks! Here is my first PR #2429

NikolayTV avatar Apr 18 '24 08:04 NikolayTV

How can I contribute?

abhinavankur avatar Apr 23 '24 11:04 abhinavankur

How can I contribute?

Welcome @abhinavankur! You can find the contribution guide at https://microsoft.github.io/autogen/docs/contributor-guide/contributing/

randombet avatar Apr 23 '24 18:04 randombet

My PR for Function Calling support #2793

arjun-g avatar May 25 '24 18:05 arjun-g

Hi I would like to help/contribute as well. Any open issues from the list? (cc hello @ekzhu @sonichi )

yeounoh avatar Jun 20 '24 22:06 yeounoh

Hi I would like to help/contribute as well. Any open issues from the list? (cc hello @ekzhu @sonichi )

I raised a feature request for caching (https://github.com/microsoft/autogen/issues/3038). Can I take this as my first contribution?

yeounoh avatar Jun 28 '24 18:06 yeounoh