
feat: allow setting retry delay

Open Guust-Franssens opened this issue 5 months ago • 5 comments

Confirm this is a feature request for the Python library and not the underlying OpenAI API.

  • [X] This is a feature request for the Python library

Describe the feature or improvement you're requesting

Currently, the retry delays used by the base client (`_base_client.py`) are not easily exposed. End users have easy control over the maximum number of retries, but not over the maximum retry delay or the initial retry delay.

https://github.com/openai/openai-python/blob/8ee5f33e8776e4517ef91a1cb2fafb6af2ca9310/src/openai/_base_client.py#L76-L78

Preferably, these would be exposed and easily settable, similar to `max_retries` in the `OpenAI` class: https://github.com/openai/openai-python/blob/8ee5f33e8776e4517ef91a1cb2fafb6af2ca9310/src/openai/_client.py#L49-L74
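To illustrate the request, here is a hypothetical sketch of what such a constructor could look like: retry delays exposed as parameters alongside `max_retries`. The class and parameter names are illustrative only and are not part of openai-python; the default values assume `INITIAL_RETRY_DELAY = 0.5` and `MAX_RETRY_DELAY = 8.0`, per the linked source.

```python
class ClientSketch:
    """Hypothetical client exposing retry delays as constructor parameters."""

    def __init__(
        self,
        max_retries: int = 2,
        initial_retry_delay: float = 0.5,
        max_retry_delay: float = 8.0,
    ) -> None:
        self.max_retries = max_retries
        self.initial_retry_delay = initial_retry_delay
        self.max_retry_delay = max_retry_delay

    def _retry_timeout(self, nb_retries: int) -> float:
        # Same exponential backoff as in _base_client.py, but driven by
        # instance-level values instead of module-level constants.
        return min(self.initial_retry_delay * pow(2.0, nb_retries), self.max_retry_delay)
```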

Additional context

No response

Guust-Franssens avatar Feb 20 '24 14:02 Guust-Franssens

Thanks for the report! Out of curiosity, what's your use case for wanting to adjust those values?

dackerman avatar Feb 20 '24 15:02 dackerman

I also need this. I'm currently implementing it with Celery scheduling, since the openai library doesn't support it. My use case: retries against the OpenAI API often fail repeatedly, so we don't like the idea of hammering the API again and again while it keeps failing. At minimum, we'd like an exponential (or Fibonacci) backoff to avoid hitting the OpenAI APIs while they are slow or buggy, which happens very often.

charlyjazz-sprockets avatar Mar 06 '24 14:03 charlyjazz-sprockets
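The exponential backoff with jitter that the commenter describes can be sketched generically. This is illustrative only, not part of openai-python or Celery; the function name and parameters are made up for the example.

```python
import random
import time


def retry_with_backoff(fn, max_retries: int = 5, base_delay: float = 0.5, max_delay: float = 8.0):
    """Call fn(), retrying on any exception with capped exponential backoff."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_retries:
                raise  # exhausted all retries; surface the last error
            # Exponential growth, capped at max_delay, with full jitter
            # so concurrent clients don't retry in lockstep.
            delay = min(base_delay * 2 ** attempt, max_delay)
            time.sleep(random.uniform(0, delay))
```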

Does it fail with 429 repeatedly? Or another error message? What kind of rate limit are you hitting?

(I ask because the client should be waiting long enough that you wouldn't hit a second 429, so if that's not happening, we need to adjust something).

rattrayalex avatar Mar 06 '24 22:03 rattrayalex

There are rate limits on Azure OpenAI that are based on the service tier that you are using. My specific scenario is using AOAI APIs in offline evaluation.

Here's the error:

openai.RateLimitError: Error code: 429 - {'error': {'code': '429', 'message': 'Requests to the ChatCompletions_Create Operation under Azure OpenAI API version 2024-02-15-preview have exceeded call rate limit of your current AIServices S0 pricing tier. Please retry after 1 second. Please contact Azure support service if you would like to further increase the default rate limit.'}}

This is where those constants are used:

_base_client.py:672:

        # Apply exponential backoff, but not more than the max.
        sleep_seconds = min(INITIAL_RETRY_DELAY * pow(2.0, nb_retries), MAX_RETRY_DELAY)
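Plugging the library's defaults into that formula (assuming `INITIAL_RETRY_DELAY = 0.5` and `MAX_RETRY_DELAY = 8.0`, per the source linked above) gives the following schedule:

```python
# Reproduce the backoff computation from _base_client.py with the
# assumed default constants.
INITIAL_RETRY_DELAY = 0.5
MAX_RETRY_DELAY = 8.0


def sleep_seconds(nb_retries: int) -> float:
    # Apply exponential backoff, but not more than the max.
    return min(INITIAL_RETRY_DELAY * pow(2.0, nb_retries), MAX_RETRY_DELAY)


# Delays double each attempt, then cap at MAX_RETRY_DELAY.
print([sleep_seconds(n) for n in range(6)])  # [0.5, 1.0, 2.0, 4.0, 8.0, 8.0]
```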

jflam avatar Apr 28 '24 18:04 jflam

@kristapratico in this scenario, can you confirm whether AOAI will respond with retry-after and/or retry-after-ms headers?

rattrayalex avatar May 13 '24 01:05 rattrayalex

@jflam is this using a PTU deployment? Upon 429, I would expect AOAI to return retry-after (and maybe retry-after-ms if this is PTU) which this library will honor. Are you not seeing those headers in the response?

kristapratico avatar May 13 '24 20:05 kristapratico
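For context, a client that honors these headers would prefer `retry-after-ms` (milliseconds) over `retry-after` (seconds) when both are present. The sketch below is illustrative only; openai-python's actual logic lives in `_base_client.py` and may differ (for example, `retry-after` may also carry an HTTP-date, which this sketch skips).

```python
from typing import Optional


def retry_delay_from_headers(headers: dict) -> Optional[float]:
    """Return a retry delay in seconds from 429 response headers, if any."""
    ms = headers.get("retry-after-ms")
    if ms is not None:
        return float(ms) / 1000.0
    s = headers.get("retry-after")
    if s is not None:
        try:
            return float(s)
        except ValueError:
            return None  # could be an HTTP-date; not handled in this sketch
    return None
```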

No, it isn't. I eventually figured out that I needed to use the with_retry() method on the LLM to configure this behavior correctly. Since doing that, I don't think this is a problem anymore. I blame the docs for not making this clear. :)

jflam avatar May 13 '24 20:05 jflam