
[DRAFT] Add OpenAI Client Error Handler

Open  lspinheiro opened this pull request 9 months ago • 8 comments

Why are these changes needed?

This PR introduces an error handler decorator on the OpenAI client. The implementation allows users to specify error callbacks for cases where the client fails. The goal is to always provide a CreateResult that the client can propagate to the agent, avoiding application termination due to the exception.

There are default implementations for the content safety error and a retry mechanism for transient errors. I still need to check how the change will propagate to agentchat.
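The retry-then-fallback behavior described above could be sketched roughly as follows. This is a minimal illustration, not the PR's actual code; `CreateResult`, `with_error_handler`, and the callback signature are all hypothetical stand-ins:

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable

@dataclass
class CreateResult:
    # Hypothetical stand-in for the client's result type.
    content: str
    finish_reason: str = "stop"

# An error callback maps an exception to a fallback CreateResult.
ErrorCallback = Callable[[Exception], CreateResult]

def with_error_handler(callback: ErrorCallback, max_retries: int = 2):
    """Decorator sketch: retry transient failures, then fall back to the callback."""
    def decorator(create: Callable[..., Awaitable[CreateResult]]):
        async def wrapper(*args, **kwargs) -> CreateResult:
            for attempt in range(max_retries + 1):
                try:
                    return await create(*args, **kwargs)
                except Exception as exc:
                    if attempt == max_retries:
                        # Out of retries: return a fallback result instead of
                        # raising, so the agent keeps running.
                        return callback(exc)
            raise AssertionError("unreachable")
        return wrapper
    return decorator
```

A caller would then decorate a `create`-style coroutine and receive either the real result or the callback's fallback, never an unhandled exception.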

Related issue number

Checks

lspinheiro avatar Feb 19 '25 07:02 lspinheiro

@ekzhu , can you have a look and tag the relevant people for feedback?

lspinheiro avatar Feb 19 '25 07:02 lspinheiro

Thanks for the draft PR! It presents a good starting point. For create_stream, the caller will consume the output as an async generator. By wrapping the create_stream method, do you think it may result in weird behavior, such as the generator restarting from the beginning without the caller's knowledge?

ekzhu avatar Feb 21 '25 19:02 ekzhu

> Thanks for the draft PR! It presents a good starting point. For create_stream, the caller will consume the output as an async generator. By wrapping the create_stream method, do you think it may result in weird behavior, such as the generator restarting from the beginning without the caller's knowledge?

What I think may happen is that if the exception occurs in the middle of the stream, the client will start streaming the response, then retry, and the stream may reset without any clear warning. (I'm not sure there is much we can do -- maybe add a warning in the callback to make it clear that there was an exception?)

I think the final processing of the CreateResult will flow as expected.
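The silent-reset scenario described above can be sketched with a toy retrying wrapper. All names here are hypothetical illustrations, not this PR's code; `RuntimeError` stands in for a transient error:

```python
import asyncio
from typing import AsyncGenerator, Callable

def retry_stream(factory: Callable[[], AsyncGenerator[str, None]],
                 max_retries: int = 1):
    """Wrap a stream factory with retries. If the inner stream raises
    midway, the wrapper restarts it from the beginning -- the caller sees
    repeated chunks with no explicit signal that a reset happened."""
    async def wrapped() -> AsyncGenerator[str, None]:
        for attempt in range(max_retries + 1):
            try:
                async for chunk in factory():
                    yield chunk
                return  # stream finished cleanly
            except RuntimeError:  # stand-in for a transient error
                if attempt == max_retries:
                    raise
    return wrapped
```

If the inner stream yields "Hello", fails, then yields "Hello world" on the retry, the caller receives "Hello" twice with no indication of the restart, which is exactly the weird behavior being discussed.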

lspinheiro avatar Feb 21 '25 23:02 lspinheiro

For create_stream I think a warning may not be sufficient, because the caller is not notified gracefully -- only the user inspecting the output is notified. It's going to be tricky, but I think the best way, from the caller's point of view, is to emit some kind of truncated CreateResult (TruncatedCreateResult, for example) signaling a premature finish while more string chunks are still to come. The caller can then handle it appropriately. The AssistantAgent could then emit a ModelClientStreamResetEvent to signal a reset of the stream due to an error or retry.

It's all too complicated for one PR. Perhaps we can address create_stream separately in the future.
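The sentinel idea floated above could look roughly like this. Everything here is a hypothetical sketch of the proposal, not an API that exists in autogen; `TruncatedCreateResult` is the name suggested in the comment:

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncGenerator, Callable

@dataclass
class TruncatedCreateResult:
    """Hypothetical sentinel: the stream was cut short; more chunks follow."""
    reason: str

def retry_stream_with_signal(factory: Callable[[], AsyncGenerator[str, None]],
                             max_retries: int = 1):
    async def wrapped():
        for attempt in range(max_retries + 1):
            try:
                async for chunk in factory():
                    yield chunk
                return  # stream finished cleanly
            except RuntimeError as exc:  # stand-in for a transient error
                if attempt == max_retries:
                    raise
                # Tell the caller the stream was truncated and will restart,
                # instead of silently replaying chunks.
                yield TruncatedCreateResult(reason=str(exc))
    return wrapped
```

A consumer can then check `isinstance(item, TruncatedCreateResult)` and, for example, discard the partial text before the reset, which is what an AssistantAgent-level ModelClientStreamResetEvent would enable downstream.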

ekzhu avatar Feb 22 '25 03:02 ekzhu

> For create_stream I think a warning may not be sufficient, because the caller is not notified gracefully -- only the user inspecting the output is notified. It's going to be tricky, but I think the best way, from the caller's point of view, is to emit some kind of truncated CreateResult (TruncatedCreateResult, for example) signaling a premature finish while more string chunks are still to come. The caller can then handle it appropriately. The AssistantAgent could then emit a ModelClientStreamResetEvent to signal a reset of the stream due to an error or retry.
>
> It's all too complicated for one PR. Perhaps we can address create_stream separately in the future.

@ekzhu For this one, how do you want to split it into separate PRs? Do you want to create the TruncatedCreateResult in this one and leave the updates to AssistantAgent to another PR?

lspinheiro avatar Mar 05 '25 01:03 lspinheiro

Hi, is this still being worked on? Getting killed by rate limits today and was hoping someone was working on a handler.

jspv avatar May 30 '25 00:05 jspv

> Hi, is this still being worked on? Getting killed by rate limits today and was hoping someone was working on a handler.

Let me try to finish it up.

lspinheiro avatar May 30 '25 04:05 lspinheiro

Codecov Report

Attention: Patch coverage is 74.22680% with 25 lines in your changes missing coverage. Please review.

Project coverage is 79.51%. Comparing base (c683175) to head (80c4629). Report is 69 commits behind head on main.

Files with missing lines Patch % Lines
...togen_ext/models/openai/_error_handler_callback.py 71.59% 25 Missing :warning:
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5615      +/-   ##
==========================================
- Coverage   79.54%   79.51%   -0.04%     
==========================================
  Files         225      226       +1     
  Lines       16669    16766      +97     
==========================================
+ Hits        13260    13332      +72     
- Misses       3409     3434      +25     
Flag Coverage Δ
unittests 79.51% <74.22%> (-0.04%) :arrow_down:

Flags with carried forward coverage won't be shown.


codecov[bot] avatar May 30 '25 06:05 codecov[bot]