About rate limits
I found some explanations on how rate limits work at https://platform.openai.com/docs/guides/rate-limits/how-do-rate-limits-work , but I'm still confused. If I make the first successful request at 0ms, and a failed request at 800ms, when will the next request be successful? At 1000ms or 1800ms? (Let's assume the RPM = 60)
I'm not 100% sure, but I think the rate limits are always per minute, so you'll be able to send at least 60 before you get an error
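To make the two outcomes in the question concrete, here is a toy model. It assumes the limit is enforced as one request per 1000 ms (60 RPM spread evenly) rather than as 60 per minute, and it is only a sketch: `ToyRateLimiter` and its flag are made up for illustration and say nothing about what OpenAI or Azure actually do. The flag switches between "rejected requests don't count" (next success at 1000 ms) and "rejected requests restart the wait" (next success at 1800 ms).

```java
// Toy model only, NOT the real server-side algorithm (that is an open question here).
final class ToyRateLimiter
{
    private final long intervalMs;                 // minimum spacing between accepted requests
    private final boolean rejectedRequestsCount;   // hypothesis switch
    private long nextAllowedMs = 0;

    ToyRateLimiter(int requestsPerMinute, boolean rejectedRequestsCount)
    {
        this.intervalMs = 60_000L / requestsPerMinute;
        this.rejectedRequestsCount = rejectedRequestsCount;
    }

    // Returns true if a request arriving at nowMs would be accepted under this model.
    boolean tryRequest(long nowMs)
    {
        if (nowMs >= nextAllowedMs)
        {
            nextAllowedMs = nowMs + intervalMs;
            return true;
        }
        if (rejectedRequestsCount)
        {
            // Under this hypothesis a rejected call also restarts the wait.
            nextAllowedMs = nowMs + intervalMs;
        }
        return false;
    }
}
```

With `new ToyRateLimiter(60, false)` the sequence 0 ms, 800 ms, 1000 ms gives accept, reject, accept. With `new ToyRateLimiter(60, true)` the request at 1000 ms is still rejected and the first accepted one is at 1800 ms. If the limit really is a plain 60-per-minute window, the request at 800 ms would not fail in the first place.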
Thanks for replying, dude. I'll run more tests on both Azure and OpenAI, then post the results here for anyone who has the same question.
@louisice if you're in experimenting mode, consider using the OkHttp Interceptor below. The headers you should be interested in are "x-ratelimit-reset-tokens" and "x-ratelimit-reset-requests".
Right now I don't think those headers are exposed by any model class in this library's responses, so you may need a ThreadLocal or another mechanism to retrieve them after each request.
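As a rough sketch of the ThreadLocal idea: the reset headers carry values that, as far as I can tell, look like "1s", "6m0s" or "20ms". The helper below parses such a string into a `java.time.Duration` and keeps the last value per thread. `RateLimitReset`, `remember` and `last` are hypothetical names, not part of this library or of OkHttp, and the accepted format is an assumption based on those example values.

```java
import java.time.Duration;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library: parses reset values such as "1s", "6m0s", "20ms".
final class RateLimitReset
{
    // "ms" is listed first so that "20ms" is not read as 20 minutes followed by a stray "s".
    private static final Pattern PART = Pattern.compile("(\\d+(?:\\.\\d+)?)(ms|h|m|s)");
    // Last parsed value seen on the current thread.
    private static final ThreadLocal<Duration> LAST = new ThreadLocal<>();

    static Duration parse(String value)
    {
        Matcher m = PART.matcher(value);
        long nanos = 0;
        int end = 0;
        // Consume consecutive "<number><unit>" parts; stop at the first gap.
        while (m.find() && m.start() == end)
        {
            nanos += Math.round(Double.parseDouble(m.group(1)) * unitNanos(m.group(2)));
            end = m.end();
        }
        if (value.isEmpty() || end != value.length())
        {
            throw new IllegalArgumentException("Unrecognized duration: " + value);
        }
        return Duration.ofNanos(nanos);
    }

    private static long unitNanos(String unit)
    {
        switch (unit)
        {
            case "h": return 3_600_000_000_000L;
            case "m": return 60_000_000_000L;
            case "s": return 1_000_000_000L;
            default: return 1_000_000L; // "ms"
        }
    }

    // Store the parsed header value for the calling thread.
    static void remember(String headerValue)
    {
        LAST.set(parse(headerValue));
    }

    // Returns the last remembered value on this thread, or null if none was stored.
    static Duration last()
    {
        return LAST.get();
    }
}
```

An interceptor could call `RateLimitReset.remember(...)` with the value of "x-ratelimit-reset-requests" when it is present, and the caller could read `RateLimitReset.last()` after the request returns. That only works when the interceptor runs on the caller's thread, which should hold for synchronous `execute()` calls but not for `enqueue()`.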
import java.io.IOException;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import okhttp3.Interceptor;
import okhttp3.Response;

// Logs the rate-limit headers of each response. "log" is assumed to be an SLF4J logger
// defined on the enclosing class (for example via Lombok's @Slf4j).
private static final class HeaderLogger implements Interceptor
{
    private static final Set<String> HEADERS = Set.of(
        "x-ratelimit-limit-requests", "x-ratelimit-limit-tokens",
        "x-ratelimit-remaining-requests", "x-ratelimit-remaining-tokens",
        "x-ratelimit-reset-requests", "x-ratelimit-reset-tokens");

    @Override
    public Response intercept(Chain chain) throws IOException
    {
        Response response = chain.proceed(chain.request());
        // response.header(name) is case-insensitive and returns null when the header is absent.
        Map<String, String> result = new TreeMap<>();
        for (String name : HEADERS)
        {
            String value = response.header(name);
            if (value != null) result.put(name, value);
        }
        log.info("limit headers : {}", result);
        return response;
    }
}