openai-python
feat: allow setting retry delay
Confirm this is a feature request for the Python library and not the underlying OpenAI API.
- [X] This is a feature request for the Python library
Describe the feature or improvement you're requesting
Currently, the retry delay for the base client is not easily exposed. End users have easy control over the maximum number of retries, but not over the maximum retry delay (MAX_RETRY_DELAY) or the initial retry delay (INITIAL_RETRY_DELAY).
https://github.com/openai/openai-python/blob/8ee5f33e8776e4517ef91a1cb2fafb6af2ca9310/src/openai/_base_client.py#L76-L78
Preferably these are exposed and easily settable similar to max_retries in the OpenAI class: https://github.com/openai/openai-python/blob/8ee5f33e8776e4517ef91a1cb2fafb6af2ca9310/src/openai/_client.py#L49-L74
Additional context
No response
Thanks for the report! Ooc, what's your use-case for wanting to adjust those values?
I also need this. I am using Celery scheduling to implement it, since the openai library does not support it. My use case is that retries against the OpenAI API fail repeatedly most of the time, and we don't like the idea of hitting OpenAI again and again while it keeps failing. At the very least, we would like an exponential or Fibonacci retry delay (or something similar) to avoid hitting the OpenAI APIs when they are slow or buggy, which is very often the case.
Does it fail with 429 repeatedly? Or another error message? What kind of rate limit are you hitting?
(I ask because the client should be waiting long enough that you wouldn't hit a second 429, so if that's not happening, we need to adjust something).
There are rate limits on Azure OpenAI that are based on the service tier that you are using. My specific scenario is using AOAI APIs in offline evaluation.
Here's the error:
openai.RateLimitError: Error code: 429 - {'error': {'code': '429', 'message': 'Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-02-15-preview have exceeded call rate limit of your current AIServices S0 pricing tier. Please retry after 1 second. Please contact Azure support service if you would like to further increase the default rate limit.
This is where those constants are used:
_base_client.py:672:
# Apply exponential backoff, but not more than the max.
sleep_seconds = min(INITIAL_RETRY_DELAY * pow(2.0, nb_retries), MAX_RETRY_DELAY)
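To make the effect of those two constants concrete, here is a minimal standalone sketch of the capped exponential backoff from the line quoted above. The default values of 0.5 and 8.0 seconds are assumptions for illustration (check the linked _base_client.py for the real ones), and `backoff_delay` is a hypothetical helper, not a function in the library. Any jitter the client may apply is left out.

```python
def backoff_delay(nb_retries: int, initial_delay: float = 0.5, max_delay: float = 8.0) -> float:
    """Capped exponential backoff, mirroring the quoted line.

    initial_delay and max_delay stand in for INITIAL_RETRY_DELAY and
    MAX_RETRY_DELAY; the defaults here are assumed values, not confirmed ones.
    """
    if nb_retries < 0:
        raise ValueError("nb_retries must be non-negative")
    # Apply exponential backoff, but not more than the max.
    return min(initial_delay * pow(2.0, nb_retries), max_delay)


# Delay before each of the first six retries under the assumed defaults.
delays = [backoff_delay(n) for n in range(6)]
print(delays)
```

With these assumed values the delay doubles on each retry until it reaches the cap, which is why being able to change both numbers matters for slow or heavily rate-limited endpoints.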
@kristapratico in this scenario, can you confirm whether AOAI will respond with retry-after and/or retry-after-ms headers?
@jflam is this using a PTU deployment? Upon 429, I would expect AOAI to return retry-after (and maybe retry-after-ms if this is PTU), which this library will honor. Are you not seeing those headers in the response?
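For readers following along, this is a rough sketch of what "honoring" those headers could look like: prefer retry-after-ms, then retry-after, and fall back to capped exponential backoff only when neither gives a usable hint. The function names, the 60-second sanity limit, and the default delays are all assumptions for illustration, not the library's actual implementation.

```python
from typing import Mapping, Optional


def retry_after_seconds(headers: Mapping[str, str]) -> Optional[float]:
    """Return the server's retry hint in seconds, or None if absent or unparsable."""
    lowered = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    ms = lowered.get("retry-after-ms")
    if ms is not None:
        try:
            return float(ms) / 1000.0
        except ValueError:
            pass
    secs = lowered.get("retry-after")
    if secs is not None:
        try:
            return float(secs)
        except ValueError:
            pass  # retry-after may also be an HTTP date; not handled in this sketch
    return None


def choose_delay(headers: Mapping[str, str], nb_retries: int,
                 initial_delay: float = 0.5, max_delay: float = 8.0) -> float:
    """Use the server hint when it looks reasonable, else capped exponential backoff."""
    hinted = retry_after_seconds(headers)
    if hinted is not None and 0 < hinted <= 60:  # assumed sanity limit
        return hinted
    return min(initial_delay * pow(2.0, nb_retries), max_delay)


print(choose_delay({"Retry-After": "1"}, nb_retries=3))  # server hint wins
print(choose_delay({}, nb_retries=2))                    # falls back to backoff
```

If the 429 response in the error above carries a retry-after of 1 second, a client behaving like this sketch would wait that long rather than the computed backoff.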
No, it isn't. I eventually figured out that I needed to use the with_retry() method on the LLM to configure this behavior correctly. Since I did that, I don't think this is a problem anymore. I blame the docs for not making this clear. :)