Enable gemini context caching

Open yeounoh opened this issue 1 year ago • 5 comments
Why are these changes needed?

The Gemini model API introduced a context caching feature that caches the prompt prefix. This PR enables the feature in GeminiClient to help reduce the cost of using the latest Gemini models. Note that this is a Gemini-specific feature: it caches the prompt prefix, not an agent's input and output.
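
To illustrate the distinction between the cached prompt prefix and the uncached agent turns, here is a minimal toy sketch. It does not call the Gemini API and is not the code in this PR; `PrefixCache` and `build_request` are hypothetical names used only for illustration, and the in-memory dictionary stands in for whatever the service stores on its side.

```python
import hashlib
from typing import Dict, List


class PrefixCache:
    """Toy stand-in for a context cache (hypothetical, not the Gemini API)."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get_or_create(self, prefix: str) -> str:
        """Return a handle for the prefix, storing it on first use."""
        key = hashlib.sha256(prefix.encode("utf-8")).hexdigest()
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = prefix
        return key


def build_request(cache: PrefixCache, prefix: str, turns: List[str]) -> dict:
    """Refer to the shared prefix by handle; agent turns are sent as-is."""
    return {"cached_prefix": cache.get_or_create(prefix), "contents": list(turns)}


cache = PrefixCache()
prefix = "You are a helpful assistant. <long shared instructions>"
first = build_request(cache, prefix, ["Hello"])
second = build_request(cache, prefix, ["Hello", "Hi there", "Summarize our chat"])
```

Both requests carry the same prefix handle, so the long prefix is stored once, while the conversation turns change from call to call and are never cached.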

Related issue number

Addresses/closes #3038

Checks

  • [ ] I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
  • [x] I've added tests (if relevant) corresponding to the changes introduced in this PR.
  • [ ] I've made sure all auto checks have passed.

yeounoh, Jul 24 '24 23:07

@yeounoh please read the following Contributor License Agreement (CLA). If you agree with the CLA, please reply with the following information.

@microsoft-github-policy-service agree [company="{your company}"]

Options:

  • (default - no company specified) I have sole ownership of intellectual property rights to my Submissions and I am not making Submissions in the course of work for my employer.
@microsoft-github-policy-service agree
  • (when company given) I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer. By signing below, the defined term “You” includes me and my employer.
@microsoft-github-policy-service agree company="Microsoft"

@microsoft-github-policy-service agree

yeounoh, Jul 24 '24 23:07

Codecov Report

Attention: Patch coverage is 43.47826% with 26 lines in your changes missing coverage. Please review.

Please upload report for BASE (0.2@f9295c4).

Files with missing lines Patch % Lines
autogen/oai/gemini.py 43.47% 26 Missing :warning:
Additional details and impacted files
@@          Coverage Diff           @@
##             0.2    #3207   +/-   ##
======================================
  Coverage       ?   13.82%           
======================================
  Files          ?       97           
  Lines          ?    10849           
  Branches       ?     2488           
======================================
  Hits           ?     1500           
  Misses         ?     9313           
  Partials       ?       36           
Flag Coverage Δ
unittests 13.78% <43.47%> (?)

codecov-commenter, Jul 26 '24 01:07

It's failing to build due to a missing google package. I will add a notebook to demonstrate the usage before review.

yeounoh, Jul 26 '24 17:07

@yeounoh In your test case, you can move the import line into the existing try/except clause.
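
A sketch of what this suggestion might look like, assuming the missing package is `google.generativeai` and using the standard library's `unittest` for skipping. The names `skip_gemini` and `TestGeminiCaching` are hypothetical and not taken from the PR's test file.

```python
import unittest

try:
    # Optional dependency: keeping the import inside the guard means a
    # missing package skips the tests instead of failing at import time.
    import google.generativeai as genai  # noqa: F401

    skip_gemini = False
except ImportError:
    genai = None
    skip_gemini = True


@unittest.skipIf(skip_gemini, "google-generativeai is not installed")
class TestGeminiCaching(unittest.TestCase):
    def test_sdk_is_importable(self):
        # Placeholder check; real tests for the caching path would go here.
        self.assertIsNotNone(genai)
```

With this layout the test module still imports cleanly on machines without the Gemini SDK, and the affected tests are reported as skipped.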

BeibinLi, Jul 28 '24 17:07

@yeounoh Is this PR ready to be reviewed?

ekzhu, Oct 01 '24 04:10

Hi @yeounoh - we've rebased and updated this for you. There are still a couple of conflicts. If you think this is ready for review, please resolve the conflicts and then we will review.

rysweet, Oct 12 '24 00:10

Closing in favor of https://github.com/microsoft/autogen/pull/5524 for 0.4. Thanks for your work on this though!

jackgerrits, Feb 26 '25 18:02