autogen
autogen copied to clipboard
[DRAFT] Add OpenAI Client Error Handler
Why are these changes needed?
This PR introduces an error handler decorator on the openai client. The implementation allow users to specify error callbacks in case the client fails. The goal is to always provide a create result that the client can propagate to the agent to avoid application termination due to the exception.
There are default implementatios for the content safety error and a retry mechanism for transient errors. Still needs to check how the change will propagate to agentchat.
Related issue number
Checks
- [ ] I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://github.com/microsoft/autogen/blob/main/CONTRIBUTING.md to build and test documentation locally.
- [ ] I've added tests (if relevant) corresponding to the changes introduced in this PR.
- [ ] I've made sure all auto checks have passed.
@ekzhu , can you have a look and tag the relevant people for feedback?
Thanks for the draft PR! It presents a good starting point. For create_stream the caller will be consume the output as an async generator. By wrapping the create_stream method, do you think it may result in weird behavior such as the generator restarts from the beginning without the caller's knowledge?
Thanks for the draft PR! It presents a good starting point. For
create_streamthe caller will be consume the output as an async generator. By wrapping thecreate_streammethod, do you think it may result in weird behavior such as the generator restarts from the beginning without the caller's knowledge?
What I think may happen is that if the exception occurs in the middle of the stream, then the client will start streaming the response and then it may retry and there may be a reset in the stream without any clear warning (I'm not sure there is much we can do, maybe add a warning in the callback to make it clear that there was an exception?)
I think the final processing of the CreateResult will flow as expected.
For create_stream I think a warning may not be sufficient, because the caller is not notified gracefully -- only the user inspecting the output is notified. It's gonna be tricky, but I think the best way from the caller's point of view is to emit some kind of truncated CreateResult signaling a pre-mature finish (TruncatedCreateResult, for example) but more string chunks will come. So the caller can handle it appropriately. From the AssistantAgent it can then emit a ModelClientStreamResetEvent to signal a reset of the stream due to error or retry.
It's all too complicated for one PR. Perhaps we can address create_stream separately in the future.
For
create_streamI think a warning may not be sufficient, because the caller is not notified gracefully -- only the user inspecting the output is notified. It's gonna be tricky, but I think the best way from the caller's point of view is to emit some kind of truncatedCreateResultsignaling a pre-mature finish (TruncatedCreateResult, for example) but more string chunks will come. So the caller can handle it appropriately. From theAssistantAgentit can then emit aModelClientStreamResetEventto signal a reset of the stream due to error or retry.It's all too complicated for one PR. Perhaps we can address
create_streamseparately in the future.
@ekzhu For this one, how do you want to split into separate PRs? Do you want to create the the TruncatedCreateResult. in this one and leave the updates to AssistantAgent to another PR?
Hi, is this still being worked on? Getting killed by rate limits today and was hoping someone was working on a handler.
Hi, is this still being worked on? Getting killed by rate limits today and was hoping someone was working on a handler.
Let me try to finish it up.
Codecov Report
Attention: Patch coverage is 74.22680% with 25 lines in your changes missing coverage. Please review.
Project coverage is 79.51%. Comparing base (
c683175) to head (80c4629). Report is 69 commits behind head on main.
| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...togen_ext/models/openai/_error_handler_callback.py | 71.59% | 25 Missing :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## main #5615 +/- ##
==========================================
- Coverage 79.54% 79.51% -0.04%
==========================================
Files 225 226 +1
Lines 16669 16766 +97
==========================================
+ Hits 13260 13332 +72
- Misses 3409 3434 +25
| Flag | Coverage Δ | |
|---|---|---|
| unittests | 79.51% <74.22%> (-0.04%) |
:arrow_down: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
:rocket: New features to boost your workflow:
- :snowflake: Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
- :package: JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.