
Rate limit error

gameveloster opened this issue 2 years ago • 6 comments

I'm getting an openai RateLimitError when embedding my chunked texts with "text-embedding-ada-002", which I have rate limited to 8 chunks of <1024 characters every 15 seconds.

openai.error.RateLimitError: Rate limit reached for default-global-with-image-limits in organization org-xxx on requests per min. Limit: 60.000000 / min. Current: 70.000000 / min. Contact [email protected] if you continue to have issues. Please add a payment method to your account to increase your rate limit. Visit https://beta.openai.com/account/billing to add a payment method.

Every 15 seconds, I'm calling this once:

for i in range(0, len(chunked), 8):
    search_index.add_texts(texts=chunked[i : i + 8])
    time.sleep(15)

The list of chunks, chunked, was created using:

text_splitter = NLTKTextSplitter(chunk_size=1024)
chunked = [chunk for source in sources for chunk in text_splitter.split_text(source) ]

Why is my request rate exceeding 70/min when I'm only embedding at ~32 chunks/min? Does each chunk take more than 1 request to process?

Any way to better rate limit my embedding queries? Thanks

gameveloster avatar Jan 17 '23 07:01 gameveloster

Perhaps not quite the same scenario, but I'm getting exactly the same error when running the VectorDB Question Answering with Sources example.

Perhaps add some exponential backoff, as OpenAI recommends?
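Something like this might work — a minimal sketch, assuming the tenacity package and the pre-1.0 openai client (where openai.error.RateLimitError exists); embed_batch is just an illustrative helper name:

import openai
import tenacity
from langchain.embeddings import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()

# Retry on rate-limit errors with exponential backoff: 1s, 2s, 4s, ... capped at 60s.
@tenacity.retry(
    retry=tenacity.retry_if_exception_type(openai.error.RateLimitError),
    wait=tenacity.wait_exponential(multiplier=1, max=60),
    stop=tenacity.stop_after_attempt(6),
)
def embed_batch(batch):
    # Embed a batch of strings; retried automatically when a RateLimitError is raised.
    return embeddings.embed_documents(batch)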

alvaropp avatar Jan 19 '23 13:01 alvaropp

I ran into rate limits when using FAISS.from_texts on a markdown file with ~800 lines while running the Question Answering with Sources sample. I worked around it like this. Posting in case it is useful for other users:

import time

import tqdm
from langchain.vectorstores import FAISS

def chunks(lst, n):
  """Yield successive n-sized chunks from lst."""
  # https://stackoverflow.com/a/312464/18903720
  for i in range(0, len(lst), n):
    yield lst[i:i + n]

text_chunks = chunks(texts, 20)  # batches of 20 texts; adjust to your rate limit
docsearch = None
for index, chunk in tqdm.tqdm(enumerate(text_chunks)):
  if index == 0:
    docsearch = FAISS.from_texts(chunk, embeddings)  # build the index from the first batch only
  else:
    time.sleep(60)  # wait a minute so the next batch stays under the rate limit
    docsearch.add_texts(chunk)

GaurangTandon avatar Jan 28 '23 18:01 GaurangTandon

Didn't work for me. Did OpenAI change something or am I missing something here?

Can you please help me?

Same for me today with the example at https://python.langchain.com/en/latest/use_cases/code/code-analysis-deeplake.html

Is there a way to integrate a solution into the example code to avoid it?

juliencarponcy avatar May 09 '23 13:05 juliencarponcy

Still having the same issue. I tried something like this:

embeddings = OpenAIEmbeddings()
vector_store = FAISS.from_texts(texts=["example1", "example2"], embedding=embeddings)

and

vector_store = Chroma.from_texts(texts=["example1", "example2"], embedding=embeddings)

Got: Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 4.0 seconds as it raised RateLimitError: You exceeded your current quota, please check your plan and billing details..

I'm passing a list that has a length of 2, and it is giving me RateLimitError.

Tried two versions of LangChain, 0.0.162 and 0.0.188, and both produced the same error.

EricLee911110 avatar Jun 01 '23 19:06 EricLee911110

I am running into the same issue when using the function:

Chroma.from_texts

Did anyone manage to come up with a solution that gets around the rate limit?

I'm thinking of looping through the texts in a try/except block, sleeping when the RateLimitError is raised, then retrying.

mahithsc avatar Jun 16 '23 07:06 mahithsc

Any solution?

fullstackwebdev avatar Jun 20 '23 02:06 fullstackwebdev

Is this the same issue you guys are getting?

Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 4.0 seconds as it raised RateLimitError: You exceeded your current quota, please check your plan and billing details..

getsean avatar Jul 12 '23 18:07 getsean

Is this the same issue you guys are getting?

Retrying langchain.embeddings.openai.embed_with_retry.<locals>._embed_with_retry in 4.0 seconds as it raised RateLimitError: You exceeded your current quota, please check your plan and billing details..

yes

ImcLiuQian avatar Sep 06 '23 08:09 ImcLiuQian

@getsean @ImcLiuQian (and anyone else who gets "You exceeded your current quota" in the error message): this has nothing to do with the original question. Please see https://github.com/langchain-ai/langchain/issues/11914 instead.

lefta avatar Oct 17 '23 10:10 lefta

The solution is to implement exponential backoff, or just a simple 10-second wait. Use a try/except block, and when the exception is hit, wait 10 seconds before running the function again.
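For instance, a rough sketch of that pattern, assuming the pre-1.0 openai client and an existing vector store search_index (the helper name is illustrative):

import time

import openai

def add_texts_with_retry(search_index, batch, wait_seconds=10):
    # Keep retrying the same batch, sleeping whenever the rate limit is hit.
    while True:
        try:
            search_index.add_texts(batch)
            return
        except openai.error.RateLimitError:
            time.sleep(wait_seconds)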

mahithsc avatar Oct 17 '23 12:10 mahithsc

I ran into rate limits when using FAISS.from_texts on a markdown file with ~800 lines while running the Question Answering with Sources sample. I worked around it like this. Posting in case it is useful for other users:

import time

import tqdm
from langchain.vectorstores import FAISS

def chunks(lst, n):
  """Yield successive n-sized chunks from lst."""
  # https://stackoverflow.com/a/312464/18903720
  for i in range(0, len(lst), n):
    yield lst[i:i + n]

text_chunks = chunks(texts, 20)  # batches of 20 texts; adjust to your rate limit
docsearch = None
for index, chunk in tqdm.tqdm(enumerate(text_chunks)):
  if index == 0:
    docsearch = FAISS.from_texts(chunk, embeddings)  # build the index from the first batch only
  else:
    time.sleep(60)  # wait a minute so the next batch stays under the rate limit
    docsearch.add_texts(chunk)

Is there a way to do the same for FAISS.from_documents()?
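Something like this might work (a sketch of the same idea, assuming docs is a list of Document objects and embeddings is already defined): build the index from the first batch with FAISS.from_documents, then add the remaining batches with add_documents, sleeping in between.

import time

from langchain.vectorstores import FAISS

batch_size = 20  # tune to your rate limit
batches = [docs[i:i + batch_size] for i in range(0, len(docs), batch_size)]

docsearch = FAISS.from_documents(batches[0], embeddings)
for batch in batches[1:]:
    time.sleep(60)  # stay under the requests-per-minute limit
    docsearch.add_documents(batch)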

chintanmehta21 avatar Nov 22 '23 07:11 chintanmehta21

I tried the method below and it works for me:

vector_store = <your_vector_store>  # e.g. a FAISS or Chroma instance
documents = loader.load()  # any loader that you used
for doc in documents:
    vector_store.add_documents([doc])  # one document per call, so each request stays small

bilalProgTech avatar Jul 24 '24 15:07 bilalProgTech