
Does this library distinguish `429 - Too many requests` from `429 - Too many tokens`? [question]

Open mepc36 opened this issue 1 year ago • 11 comments

Describe the bug

Sorry, not a bug - just a question!

The OpenAI docs stipulate that they enforce rate limits in two ways: by request, and by token -- source here

I'm wondering if this library distinguishes between the two. I don't think it does, because here is an error log I have for the 429:

Error: Request failed with status code 429
    at createError (/usr/src/app/node_modules/axios/lib/core/createError.js:16:15)
    at settle (/usr/src/app/node_modules/axios/lib/core/settle.js:17:12)
    at IncomingMessage.handleStreamEnd (/usr/src/app/node_modules/axios/lib/adapters/http.js:322:11)
    at IncomingMessage.emit (events.js:412:35)
    at IncomingMessage.emit (domain.js:475:12)
    at endReadableNT (internal/streams/readable.js:1333:12)
    at processTicksAndRejections (internal/process/task_queues.js:82:21)
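For what it's worth, the axios error in the log above does carry the full response body, so the API's distinction is reachable even without explicit SDK support. A sketch, assuming the API's standard `{ error: { type, code, ... } }` body shape (the helper name is illustrative):

```javascript
// Pull the rate-limit "type" out of an axios-style error. Returns null
// for non-429 errors, otherwise the API's type field (e.g. 'tokens').
function rateLimitType(err) {
  if (err?.response?.status !== 429) return null;
  return err.response.data?.error?.type ?? "unknown";
}
```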

Upon confirmation from a maintainer that it doesn't, I will open a feature request requesting this differentiation. Thank you!

P.S. I'd suggest adding a third option for issue submission, a Question template, in addition to the current Bug and Feature request options.

To Reproduce

N/A

Code snippets

N/A

OS

mac Ventura (13.0.1)

Node version

16.16

Library version

3.0.0

mepc36 avatar May 18 '23 12:05 mepc36

I'm having exactly the same problem. I just started exploring this package with my new token. I'm using the latest version of openai (3.2.1) and Node v16.13.2.

I can successfully call the deprecated listEngines method, and it returns the list of all engines, but createCompletion and createChatCompletion requests always return Request failed with status code 429.

nikfakel avatar May 22 '23 22:05 nikfakel

I am having the same issue, I call:

        const completion = await openai.createChatCompletion({
            model: "gpt-3.5-turbo",
            messages: [
                {role: "system", content: "You are a legal expert. You understand all the court documents of Florida."},
                {role: "user", content: aiPrompt}
            ],
        });

And I get Error: Request failed with status code 429 :(

bitfede avatar May 29 '23 19:05 bitfede

Same here, any solution? It always returns "Too Many Requests":

  response: {
    status: 429,
    statusText: 'Too Many Requests',

chdzma avatar Jun 10 '23 11:06 chdzma

Seems we are all having the same issue, but it's Cloudflare that is limiting it:

response: {
    status: 429,
    statusText: 'Too Many Requests',
    headers: {
      date: 'Tue, 13 Jun 2023 11:51:01 GMT',
      'content-type': 'application/json; charset=utf-8',
      'content-length': '206',
      connection: 'close',
      vary: 'Origin',
      'x-request-id': 'REMOVED',
      'strict-transport-security': 'max-age=15724800; includeSubDomains',
      'cf-cache-status': 'DYNAMIC',
      server: 'cloudflare',
      'cf-ray': '7d6a1e8b1b0e43c2-EWR',
      'alt-svc': 'h3=":443"; ma=86400'
    },

redimongo avatar Jun 13 '23 11:06 redimongo

If you get a 429 error, you need to retry the request with exponential backoff. You can use the exponential-backoff package for this:

import { backOff } from "exponential-backoff";

function getWeather() {
  return fetch("weather-endpoint");
}

async function main() {
  try {
    const response = await backOff(() => getWeather());
    // process response
  } catch (e) {
    // handle error
  }
}

main();

OpenAI official answer for 429 errors:

We recommend handling these errors using exponential backoff. Exponential backoff means performing a short sleep when a rate limit error is hit, then retrying the unsuccessful request. If the request is still unsuccessful, the sleep length is increased and the process is repeated. This continues until the request is successful or until a maximum number of retries is reached.

https://help.openai.com/en/articles/5955604-how-can-i-solve-429-too-many-requests-errors
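The retry loop described above can also be hand-rolled without a dependency. A minimal sketch; the delay constants, the jitter, and the 429-only check are illustrative choices, not something the SDK prescribes:

```javascript
// Retry `fn` on 429 with exponentially growing sleeps, up to `retries`
// attempts. Reads the status from either an axios-style error
// (err.response.status) or a plain err.status field.
async function withBackoff(fn, { retries = 5, baseMs = 500 } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const status = err?.response?.status ?? err?.status;
      if (attempt >= retries || status !== 429) throw err;
      // Sleep baseMs * 2^attempt plus a little jitter, then try again.
      const delay = baseMs * 2 ** attempt + Math.random() * 50;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Usage would look like `await withBackoff(() => openai.createChatCompletion(params))`.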

nilsreichardt avatar Jun 15 '23 08:06 nilsreichardt

For what it's worth, I seem to be getting 429 errors on the first request I've ever sent as a new user (besides two that got 401s). My usage screen on OpenAI seems far from exhausted.

pixelpax avatar Aug 08 '23 02:08 pixelpax

@pixelpax please raise that at https://help.openai.com/ – it's not an SDK issue.

rattrayalex avatar Aug 09 '23 02:08 rattrayalex

@nilsreichardt FYI, the upcoming v4 of this library has auto-retry with exponential backoff baked-in.

rattrayalex avatar Aug 09 '23 02:08 rattrayalex

The API does distinguish, with a type field in the error response which can be 'tokens', 'request', or 'insufficient_quota'. For example:

{
    message: 'Rate limit reached for 10KTPM-200RPM in organization org-xxxx on tokens per min. Limit: 10000 / min. Please try again in 6ms. Contact us through our help center at help.openai.com if you continue to have issues.',
    type: 'tokens',
    param: null,
    code: 'rate_limit_exceeded'
}

In the v4 of this SDK, you can access that like so:

try {
  await client.chat.completions.create(params);
} catch (err) {
  if (err instanceof OpenAI.RateLimitError) {
    if (err.error.type === 'tokens') {…}
    if (err.error.type === 'request') {…}
  }
}

In the future, we hope to add separate error classes, like TokenRateLimitError, but I can't promise a timeline on that.

rattrayalex avatar Aug 09 '23 02:08 rattrayalex

There is also code === "insufficient_quota" to consider. Has status code 429, too.
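Putting the fields from the last two comments together, a 429 body can be bucketed by `code` first (quota exhaustion is not worth retrying) and then by `type`. A sketch based only on the values quoted in this thread; the bucket names are illustrative, and the list is not exhaustive:

```javascript
// Classify an API error body ({ type, code, ... }) from a 429 response.
function classify429(apiError) {
  if (apiError?.code === "insufficient_quota") return "insufficient_quota";
  if (apiError?.type === "tokens") return "token_rate_limit";
  if (apiError?.type === "request") return "request_rate_limit";
  return "unknown_429";
}
```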

nikwen avatar Aug 15 '23 12:08 nikwen

@nilsreichardt FYI, the upcoming v4 of this library has auto-retry with exponential backoff baked-in.

See #400 (Retries) and v4.14.0 (2023-10-25).

matijagrcic avatar Dec 12 '23 23:12 matijagrcic

I'm going to close this issue as we have an example of accessing this information documented here, though I would still like to add independent classes for each error code at some point.

rattrayalex avatar Jul 08 '24 18:07 rattrayalex