
Add headers to response info

Open Krobys opened this issue 1 year ago • 5 comments

Hello. This is a feature request. I need access to the fields in the response headers returned by this API, since they carry important information, for example:

      x-ratelimit-limit-requests: 3500
      x-ratelimit-limit-tokens: 180000
      x-ratelimit-remaining-requests: 3499
      x-ratelimit-remaining-tokens: 179972
      x-ratelimit-reset-requests: 17ms
      x-ratelimit-reset-tokens: 9ms
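Once the headers are exposed, they could be turned into a typed value. A minimal sketch, assuming the header names shown above; `RateLimitInfo` and `parseRateLimit` are hypothetical helpers, not part of openai-kotlin (`kotlin.time.Duration.parse` understands values like `"17ms"`):

```kotlin
import kotlin.time.Duration

// Hypothetical holder for the rate-limit headers listed above.
data class RateLimitInfo(
    val remainingRequests: Int,
    val remainingTokens: Int,
    val resetRequests: Duration,
    val resetTokens: Duration,
)

// Parses the headers into RateLimitInfo, or returns null if any are missing/malformed.
fun parseRateLimit(headers: Map<String, String>): RateLimitInfo? {
    fun int(name: String) = headers[name]?.toIntOrNull()
    fun duration(name: String) =
        headers[name]?.let { runCatching { Duration.parse(it) }.getOrNull() }
    return RateLimitInfo(
        remainingRequests = int("x-ratelimit-remaining-requests") ?: return null,
        remainingTokens = int("x-ratelimit-remaining-tokens") ?: return null,
        resetRequests = duration("x-ratelimit-reset-requests") ?: return null,
        resetTokens = duration("x-ratelimit-reset-tokens") ?: return null,
    )
}
```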

I need access to this header data, or at least some of the headers. It would be great if you could add companion functions that return Pair<OUR REGULAR RESPONSE, HEADERS> and live next to the usual ones, e.g. function1(): Response and function1Extra(): Pair<Response, Headers>.

Krobys avatar Jun 27 '23 15:06 Krobys

This indeed can be a good enhancement, I will look into it soon!

aallam avatar Jun 28 '23 10:06 aallam

I played a bit with the code; here is a possible way:

  1. Add a function chatCompletionWithHeaders to the Chat interface.
  2. Implement the function in ChatApi:

         override suspend fun chatCompletionWithHeaders(request: ChatCompletionRequest): Pair<ChatCompletion, Headers> {
             val httpResponse: HttpResponse = requester.perform {
                 it.post {
                     url(path = ApiPath.ChatCompletions)
                     setBody(request)
                     contentType(ContentType.Application.Json)
                 }.body()
             }
             return httpResponse.body<ChatCompletion>() to httpResponse.headers
         }

I tested this and it works, but maybe you will find a more abstract way.
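One more abstract option might be a shared wrapper type instead of a Pair, so every *WithHeaders variant has the same shape. A sketch only; `ResponseWithHeaders` is a hypothetical name, not part of openai-kotlin:

```kotlin
// Hypothetical generic wrapper: pairs any response body with its headers
// under named fields instead of Pair's first/second.
data class ResponseWithHeaders<T>(
    val response: T,
    val headers: Map<String, List<String>>,
)
```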

Krobys avatar Jun 30 '23 17:06 Krobys

Upvote! I am also looking for a good way to get access to the rate limit info that is included in the response headers.

palhoye avatar Dec 06 '23 21:12 palhoye

+1. Is there any workaround for this while the issue is still unresolved?

Gtofig avatar Mar 04 '24 02:03 Gtofig

Might be able to do something like this:

      OpenAI(
          "your-open-ai-api-key",
          httpClientConfig = {
            install(
                createClientPlugin("RateLimitPlugin") {
                  onResponse { response ->
                    response.headers["x-ratelimit-remaining-tokens"]?.let { remainingTokens ->
                      response.headers["x-ratelimit-reset-tokens"]?.let { resetTime ->
                    // Do something with the tokens and reset time;
                    // `service` stands for your own storage, not part of the library
                    service.storeTokensAndResetTime(remainingTokens, resetTime)
                      }
                    }
                  }
                },
            )
          },
      )

And maybe add an extension function to OpenAI that wraps the actual call and checks the rate limits:

  // Assumes `service` and `modelId` are supplied by the surrounding code.
  suspend fun OpenAI.chatCompletionWithRateLimitCheck(chatCompletionRequest: ChatCompletionRequest): ChatCompletion {
    val (remainingTokens, resetTime) = service.getTokensAndResetTime()
    val requestTokenCount = getMessagesTokenCount(chatCompletionRequest.messages, modelId)
    if (requestTokenCount > remainingTokens) {
      // resetTime arrives as e.g. "17ms"; kotlin.time.Duration.parse handles that format
      delay(Duration.parse(resetTime).inWholeMilliseconds)
    }
    return chatCompletion(chatCompletionRequest)
  }

  // The token count is the number of encoded tokens (the size of the encoding),
  // not the sum of the token IDs.
  suspend fun getMessagesTokenCount(messages: List<ChatMessage>, modelId: String) = messages
      .sumOf { Tokenizer.of(model = modelId).encode(it.content ?: "").size }

Just ideas - haven't tested any of this.
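The delay decision itself could also be kept as a pure function, so it is testable without any HTTP client. A sketch under the same assumptions as above; `waitBeforeRequest` is a hypothetical name:

```kotlin
import kotlin.time.Duration

// Hypothetical pure helper: how long to wait before sending a request that
// needs `requestTokens` tokens, given the remaining budget and the reset
// interval reported by the x-ratelimit-* headers.
fun waitBeforeRequest(requestTokens: Int, remainingTokens: Int, resetIn: Duration): Duration =
    if (requestTokens > remainingTokens) resetIn else Duration.ZERO
```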

clamey avatar Apr 04 '24 04:04 clamey