openai-kotlin
Add headers to response info
Hello. This is a feature request. I need access to the fields in the headers that come back with this API, since there are some important ones there, for example:

```
x-ratelimit-limit-requests: 3500
x-ratelimit-limit-tokens: 180000
x-ratelimit-remaining-requests: 3499
x-ratelimit-remaining-tokens: 179972
x-ratelimit-reset-requests: 17ms
x-ratelimit-reset-tokens: 9ms
```
I need access to this header data (or some other headers). It would be great if you could add additional functions that return `Pair<OUR REGULAR RESPONSE, HEADERS>` and sit next to the usual functions, e.g. `function1(): Response` and `function1Extra(): Pair<Response, Headers>`.
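To make the requested shape concrete, here is a minimal self-contained sketch of the pattern. The types below are stand-ins for illustration only, not the library's real `ChatCompletion` or Ktor's `Headers`, and the real functions would be `suspend`:

```kotlin
// Stand-in types; the real ones come from openai-kotlin and Ktor.
data class ChatCompletion(val id: String)
typealias Headers = Map<String, String>

interface Chat {
    // existing function
    fun chatCompletion(request: String): ChatCompletion

    // proposed sibling returning the body together with the response headers
    fun chatCompletionWithHeaders(request: String): Pair<ChatCompletion, Headers>
}

// A fake implementation, just to show how a caller would destructure the Pair.
class FakeChat : Chat {
    override fun chatCompletion(request: String) = ChatCompletion(id = "chatcmpl-1")

    override fun chatCompletionWithHeaders(request: String) =
        chatCompletion(request) to mapOf("x-ratelimit-remaining-tokens" to "179972")
}
```

A caller would then write `val (completion, headers) = chat.chatCompletionWithHeaders(request)` and keep using `completion` exactly as before.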
This indeed can be a good enhancement, I will look into it soon!
I played a bit with the code; here is a possible way:
- add a `chatCompletionWithHeaders` function to the `Chat` interface
- implement the function in `ChatApi`:

```kotlin
override suspend fun chatCompletionWithHeaders(request: ChatCompletionRequest): Pair<ChatCompletion, Headers> {
    val httpResponse: HttpResponse = requester.perform {
        it.post {
            url(path = ApiPath.ChatCompletions)
            setBody(request)
            contentType(ContentType.Application.Json)
        }.body()
    }
    return httpResponse.body<ChatCompletion>() to httpResponse.headers
}
```
I tested this and it works, but maybe you will find some other, more abstract way.
Upvote! I am also looking for a good way to get access to the rate limit info that is included in the response headers.
+1. Is there any workaround for this while the issue is still unresolved?
Might be able to do something like this:
```kotlin
OpenAI(
    "your-open-ai-api-key",
    httpClientConfig = {
        install(
            createClientPlugin("RateLimitPlugin") {
                onResponse { response ->
                    response.headers["x-ratelimit-remaining-tokens"]?.let { remainingTokens ->
                        response.headers["x-ratelimit-reset-tokens"]?.let { resetTime ->
                            // Do something with the tokens and reset time
                            service.storeTokensAndResetTime(remainingTokens, resetTime)
                        }
                    }
                }
            },
        )
    },
)
```
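For completeness, here is a minimal sketch of the hypothetical `service` the plugin writes to. The class and method names are made up to match the snippet, not part of the library; a real implementation might persist these values or use atomics instead:

```kotlin
// Hypothetical in-memory store for the latest rate-limit values seen in
// response headers. Int.MAX_VALUE means "no limit observed yet", so callers
// that compare against it will never throttle before the first response.
class RateLimitStore {
    @Volatile private var remainingTokens: Int = Int.MAX_VALUE
    @Volatile private var resetTime: String = "0ms"

    fun storeTokensAndResetTime(remaining: String, reset: String) {
        // Header values arrive as strings; fall back to "no limit" on garbage.
        remainingTokens = remaining.toIntOrNull() ?: Int.MAX_VALUE
        resetTime = reset
    }

    fun getTokensAndResetTime(): Pair<Int, String> = remainingTokens to resetTime
}
```

`@Volatile` keeps the two fields visible across coroutine threads, which matters because the plugin writes from Ktor's response pipeline while callers read elsewhere.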
And maybe add an extension function to OpenAI that wraps the actual call and checks the rate limits:
```kotlin
suspend fun OpenAI.chatCompletionWithRateLimitCheck(
    chatCompletionRequest: ChatCompletionRequest,
    modelId: String,
): ChatCompletion {
    // `service` is the same store the response plugin above writes to
    val (remainingTokens, resetTime) = service.getTokensAndResetTime()
    val requestTokenCount = getMessagesTokenCount(chatCompletionRequest.messages, modelId)
    if (requestTokenCount > remainingTokens) {
        // Wait for the window to reset; kotlinx.coroutines' delay accepts a kotlin.time.Duration
        delay(Duration.parse(resetTime))
    }
    // The receiver is the OpenAI client itself, so call it directly
    return chatCompletion(chatCompletionRequest)
}

suspend fun getMessagesTokenCount(messages: List<ChatMessage>, modelId: String): Int {
    val tokenizer = Tokenizer.of(model = modelId)
    // Sum the token *count* of each message, not the token ids themselves
    return messages.sumOf { tokenizer.encode(it.content ?: "").size }
}
```
Just ideas - haven't tested any of this.
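One detail worth checking in the sketch above: OpenAI sends the `x-ratelimit-reset-*` values as strings like `17ms` or `1.5s`, and `kotlin.time.Duration.parse` happens to accept that format. A small helper (the name is made up, not library API) can make the parsing failure-safe:

```kotlin
import kotlin.time.Duration

// Hypothetical helper: parse an x-ratelimit-reset-* header value such as
// "17ms" into a Duration, returning null for formats Duration.parse rejects.
fun parseResetHeader(value: String?): Duration? =
    value?.let { runCatching { Duration.parse(it) }.getOrNull() }
```

With this, the throttling branch becomes `delay(parseResetHeader(resetTime) ?: Duration.ZERO)`, which avoids throwing if the server ever changes the header format.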