
repeated connect errors for embeddings endpoint

Open · stephansturges opened this issue 1 year ago • 11 comments

Since yesterday I'm getting loads of these errors when using the embeddings API:

gaierror(8, 'nodename nor servname provided, or not known')), ClientConnectorError(ConnectionKey(host='api.openai.com', port=443, is_ssl=True

This happens on more than 50% of connection attempts when using the api_request_parallel_processor.py script from this repository. It doesn't seem to matter at what frequency I make the requests (although it's always pretty high in my tests; I'm submitting >300k items to embed).

stephansturges · Mar 04 '23 21:03

Nope, sorry. Digging deep through the logs, I spotted this error, which probably explains it:

WARNING:root:Request 224657 failed with error {'message': 'The server is currently overloaded with other requests. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if the error persists.', 'type': 'server_error', 'param': None, 'code': None}

stephansturges · Mar 04 '23 21:03

WARNING:root:Request 224657 failed with error {'message': 'The server is currently overloaded with other requests. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if the error persists.', 'type': 'server_error', 'param': None, 'code': None}

@stephansturges I'm also getting that error repeatedly, which is new for me after a couple months of using the API.

If the error message is true, the OpenAI servers are simply overwhelmed.

OpenAI recommends implementing retry mechanisms with exponential backoff: https://platform.openai.com/docs/guides/rate-limits/retrying-with-exponential-backoff

My code doesn't have that feature yet; it's never been necessary until now, as a prudent sleep after every call has been enough. That's not cutting it anymore, so I guess the only solution is to implement it.
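For reference, a minimal hand-rolled sketch of that retry-with-backoff pattern might look like the following (the model name, retry count, and delays here are illustrative placeholders, not values taken from the guide):

import random
import time

import openai
from openai.error import RateLimitError  # pre-1.0 openai SDK

def embed_with_backoff(text, max_retries=6):
    """Retry an embeddings call with exponential backoff plus jitter."""
    delay = 1.0
    for attempt in range(max_retries):
        try:
            return openai.Embedding.create(
                model="text-embedding-ada-002", input=text
            )
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # give up after the last attempt
            # Sleep 1s, 2s, 4s, ... plus random jitter, capped at 30s
            time.sleep(min(delay, 30) + random.random())
            delay *= 2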

alexerhardt · Mar 06 '23 08:03

Implementing exponential backoff with Tenacity is a breeze; in my case it was a literal 5-minute edit to my code, and it has sorted the issue completely.

alexerhardt · Mar 06 '23 09:03

Cool, thanks for that! Care to make a PR and share it back?

stephansturges · Mar 06 '23 10:03

I'm not familiar with the cookbook or its contribution policy (I found this issue by searching on Google), but the gist of it is:

import openai
from openai.error import RateLimitError  # pre-1.0 openai SDK
from tenacity import (
    retry,
    wait_exponential_jitter,
    stop_after_attempt,  # optional: cap the number of retries
    retry_if_exception_type,
)

# There are several exponential back-off methods in Tenacity; I picked the one with jitter
@retry(retry=retry_if_exception_type(RateLimitError), wait=wait_exponential_jitter(initial=1, max=30))
def open_ai_wrapper():
    response = openai.your_choice_of_call()
    # Do whatever you need with the response here
    return response


def caller():
    # Your function now handles exponential back-off automatically
    return open_ai_wrapper()

You just need to add a decorator on your wrapping function definition. The Tenacity docs are super-clear and easy to follow, and include other decorator options, such as stopping after a set number of retries.
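For instance, to give up after a fixed number of attempts and re-raise the original error, the stop_after_attempt import above can be wired in like this (the count of 6 is just an illustrative choice, not anything from the docs):

@retry(
    retry=retry_if_exception_type(RateLimitError),
    wait=wait_exponential_jitter(initial=1, max=30),
    stop=stop_after_attempt(6),  # stop retrying after 6 attempts...
    reraise=True,                # ...and re-raise the original RateLimitError
)
def open_ai_wrapper():
    ...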

alexerhardt · Mar 06 '23 12:03

Awesome, thanks!

stephansturges · Mar 06 '23 13:03

Two things that would be great to add to the parallel embeddings calculator: exponential backoff for rate limits, plus token-length calculation with an automatic limit to the first 8000 tokens per submission.

stephansturges · Apr 14 '23 21:04

What do you mean by parallel embeddings calculator, exactly?

ted-at-openai · Apr 14 '23 22:04

I mean this :) https://github.com/openai/openai-cookbook/blob/main/examples/api_request_parallel_processor.py

It works great, but it would be great to limit each request to the maximum number of tokens the model can handle (i.e. 8191 for V2) by taking the first X tokens up to that length and discarding the rest of the request. It also gets overloaded and fails when the API is getting hammered, so implementing that mitigation out of the gate might be useful? https://platform.openai.com/docs/guides/rate-limits/error-mitigation

No worries, I've fixed all of this for my local use case now.
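For anyone who wants to do the same truncation, a rough sketch with tiktoken (assuming the cl100k_base encoding used by text-embedding-ada-002 and the 8191-token limit mentioned above; the function name is made up):

import tiktoken

EMBEDDING_ENCODING = "cl100k_base"  # encoding used by text-embedding-ada-002
MAX_TOKENS = 8191                   # context limit for the V2 embedding model

def truncate_to_token_limit(text: str, max_tokens: int = MAX_TOKENS) -> str:
    """Keep the first max_tokens tokens of text and discard the rest."""
    encoding = tiktoken.get_encoding(EMBEDDING_ENCODING)
    tokens = encoding.encode(text)
    if len(tokens) <= max_tokens:
        return text
    return encoding.decode(tokens[:max_tokens])

Each JSONL line could be run through something like this before it goes into api_request_parallel_processor.py.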

stephansturges · Apr 15 '23 09:04

It would be cool if there was a way to share a bucket of JSONL files with OpenAI to get the embeddings calculations done some other way than the API. I'm trying to get them done for about 10M units of content and at the current rate it will take about 20 days to do those. Or an option to run this locally would be fantastic!

stephansturges · Apr 19 '23 15:04

Can you create more than 1 account if you have that many?

AllanSchergerGitHub · Apr 19 '23 17:04