About rate limit

Open louisice opened this issue 2 years ago • 3 comments

I found some explanations of how rate limits work at https://platform.openai.com/docs/guides/rate-limits/how-do-rate-limits-work, but I'm still confused. If my first request succeeds at 0 ms and a second request fails at 800 ms, when will the next request succeed: at 1000 ms or at 1800 ms? (Let's assume RPM = 60.)

louisice avatar Jul 11 '23 10:07 louisice

I'm not 100% sure, but I think the rate limits are always per minute, so you'll be able to send at least 60 before you get an error

TheoKanning avatar Jul 16 '23 01:07 TheoKanning

Thanks for replying, dude. I'll run more tests on both Azure and OpenAI, then post the results here for anyone who has the same question.

louisice avatar Jul 16 '23 02:07 louisice

@louisice if you're in experimenting mode, consider using the OkHttp Interceptor below. The headers you should be interested in are "x-ratelimit-reset-tokens" and "x-ratelimit-reset-requests".

Right now I don't think those headers are represented by a model in this library's responses, so you may need a ThreadLocal or another mechanism to retrieve them after each request.
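A minimal sketch of the ThreadLocal idea, using only the JDK. The class name `RateLimitContext` and its methods are hypothetical and not part of openai-java or OkHttp. It assumes the interceptor runs on the same thread that issued the request, which holds for synchronous OkHttp calls (`execute()`) but not for `enqueue()`, where interceptors run on a dispatcher thread.

```java
import java.util.Map;

/**
 * Hypothetical per-thread holder for the most recent rate limit headers.
 * An interceptor would call set(...) after each response; the caller reads
 * the values with get() once the API call has returned on the same thread.
 */
final class RateLimitContext {
    private static final ThreadLocal<Map<String, String>> LAST_HEADERS =
            ThreadLocal.withInitial(() -> Map.of());

    private RateLimitContext() {}

    /** Stores an immutable copy of the headers for the current thread. */
    static void set(Map<String, String> headers) {
        LAST_HEADERS.set(Map.copyOf(headers));
    }

    /** Returns the headers last stored on this thread, or an empty map. */
    static Map<String, String> get() {
        return LAST_HEADERS.get();
    }

    /** Clears the stored value so it does not leak across pooled threads. */
    static void clear() {
        LAST_HEADERS.remove();
    }
}
```

In the interceptor, the `log.info(...)` call could be replaced (or accompanied) by `RateLimitContext.set(result)`, and the calling code would read `RateLimitContext.get()` right after the request and then call `clear()`.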

import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import okhttp3.Interceptor;
import okhttp3.Response;

// Nested class; assumes the enclosing class provides an SLF4J logger named "log" (e.g. via Lombok's @Slf4j).
private static final class HeaderLogger implements Interceptor
{
  // Rate limit headers to pick out of each response.
  private static final List<String> HEADERS=List.of(
      "x-ratelimit-limit-requests","x-ratelimit-limit-tokens",
      "x-ratelimit-remaining-requests","x-ratelimit-remaining-tokens",
      "x-ratelimit-reset-requests","x-ratelimit-reset-tokens");

  @Override
  public Response intercept(Chain chain) throws IOException
  {
    Response response=chain.proceed(chain.request());
    // Response.header(name) matches names case-insensitively and returns null when the header is absent.
    Map<String,String> result=new LinkedHashMap<>();
    for (String name : HEADERS)
    {
      String value=response.header(name);
      if (value!=null) result.put(name,value);
    }
    log.info("limit headers : {}",result);
    return response;
  }
}
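To act on the two reset headers, their values have to be turned into a wait time. Here's a rough sketch, assuming the values are Go-style duration strings such as `1s`, `6m0s` or `20ms`, which is the shape shown in the examples in OpenAI's rate limit guide; that format is an assumption, not a documented contract. The class name `ResetHeaderParser` is hypothetical.

```java
import java.time.Duration;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that parses values like "1s", "6m0s" or "20ms". */
final class ResetHeaderParser {
    // A number followed by a unit; "ms" is listed first so it is not read as "m".
    private static final Pattern PART =
            Pattern.compile("(\\d+(?:\\.\\d+)?)(ms|h|m|s)");

    private ResetHeaderParser() {}

    static Duration parse(String value) {
        Matcher m = PART.matcher(value);
        long nanos = 0;
        int end = 0;
        while (m.find()) {
            // Reject gaps between parts, e.g. "1s x 2s".
            if (m.start() != end) {
                throw new IllegalArgumentException("Unexpected format: " + value);
            }
            double amount = Double.parseDouble(m.group(1));
            switch (m.group(2)) {
                case "h":  nanos += Math.round(amount * 3_600_000_000_000L); break;
                case "m":  nanos += Math.round(amount * 60_000_000_000L); break;
                case "s":  nanos += Math.round(amount * 1_000_000_000L); break;
                default:   nanos += Math.round(amount * 1_000_000L); break; // "ms"
            }
            end = m.end();
        }
        // Reject empty input and trailing characters that are not a valid part.
        if (end == 0 || end != value.length()) {
            throw new IllegalArgumentException("Unexpected format: " + value);
        }
        return Duration.ofNanos(nanos);
    }
}
```

After a failed request, something like `Thread.sleep(ResetHeaderParser.parse(resetValue).toMillis())` would wait for the window the server reported rather than guessing between 1000 ms and 1800 ms.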

Aelentel avatar Nov 25 '23 12:11 Aelentel