azure-sdk-for-java
[OpenAI] Support token calculation in streaming API
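OpenAI now supports stream_options in the chat completions API, which lets a streaming request report token usage.
We should look into how to support it in the Azure OpenAI SDK for Java.
The streamed response from curl looks like the following. The last chunk before [DONE] has an empty "choices" array and carries "usage":{"prompt_tokens":11,"completion_tokens":60,"total_tokens":71}; every earlier chunk has "usage":null.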
data: {"id":"xxxW","object":"chat.completion.chunk","created":1719256033,"model":"gpt-3.5-turbo-0125","system_fingerprint":null,"choices":[{"index":0,"delta":{"content":"I"},"logprobs":null,"finish_reason":null}],"usage":null}
.......
data: {"id":"xxxW","object":"chat.completion.chunk","created":1719256033,"model":"gpt-3.5-turbo-0125","system_fingerprint":null,"choices":[{"index":0,"delta":{"content":"."},"logprobs":null,"finish_reason":null}],"usage":null}
data: {"id":"xxxW","object":"chat.completion.chunk","created":1719256033,"model":"gpt-3.5-turbo-0125","system_fingerprint":null,"choices":[{"index":0,"delta":{},"logprobs":null,"finish_reason":"stop"}],"usage":null}
data: {"id":"xxxW","object":"chat.completion.chunk","created":1719256033,"model":"gpt-3.5-turbo-0125","system_fingerprint":null,"choices":[],"usage":{"prompt_tokens":11,"completion_tokens":60,"total_tokens":71}}
data: [DONE]
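
To make the expected behavior concrete, here is a minimal sketch of how a client could pull the token counts out of a stream like the one above. It is not the SDK's API: the class StreamUsageSketch, the Usage holder, and the findUsage method are hypothetical names, the sample chunks are trimmed versions of the payload above, and the regex assumes the key order shown in that payload. A real implementation would presumably use a JSON parser instead.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StreamUsageSketch {

    /** Hypothetical holder for the three counters in the "usage" object. */
    public static final class Usage {
        public final int promptTokens;
        public final int completionTokens;
        public final int totalTokens;

        Usage(int promptTokens, int completionTokens, int totalTokens) {
            this.promptTokens = promptTokens;
            this.completionTokens = completionTokens;
            this.totalTokens = totalTokens;
        }
    }

    // Assumption: keys appear in the order shown in the payload above.
    // Chunks with "usage":null do not match because the pattern requires '{'.
    private static final Pattern USAGE = Pattern.compile(
        "\"usage\"\\s*:\\s*\\{\\s*\"prompt_tokens\"\\s*:\\s*(\\d+)\\s*,"
            + "\\s*\"completion_tokens\"\\s*:\\s*(\\d+)\\s*,"
            + "\\s*\"total_tokens\"\\s*:\\s*(\\d+)");

    /** Scans server-sent event lines and returns the last non-null usage, if any. */
    public static Optional<Usage> findUsage(List<String> sseLines) {
        Usage last = null;
        for (String line : sseLines) {
            if (!line.startsWith("data:")) {
                continue; // skip blank lines and other SSE fields
            }
            String payload = line.substring("data:".length()).trim();
            if (payload.equals("[DONE]")) {
                break; // end-of-stream sentinel
            }
            Matcher m = USAGE.matcher(payload);
            if (m.find()) {
                last = new Usage(
                    Integer.parseInt(m.group(1)),
                    Integer.parseInt(m.group(2)),
                    Integer.parseInt(m.group(3)));
            }
        }
        return Optional.ofNullable(last);
    }

    public static void main(String[] args) {
        // Trimmed versions of the chunks in the payload above.
        List<String> lines = List.of(
            "data: {\"choices\":[{\"index\":0,\"delta\":{\"content\":\"I\"},\"finish_reason\":null}],\"usage\":null}",
            "data: {\"choices\":[{\"index\":0,\"delta\":{},\"finish_reason\":\"stop\"}],\"usage\":null}",
            "data: {\"choices\":[],\"usage\":{\"prompt_tokens\":11,\"completion_tokens\":60,\"total_tokens\":71}}",
            "data: [DONE]");

        findUsage(lines).ifPresent(u -> System.out.println(
            "prompt=" + u.promptTokens
                + " completion=" + u.completionTokens
                + " total=" + u.totalTokens));
    }
}
```

The point for the SDK design is that the usage chunk has an empty "choices" array, so any code that assumes every chunk contains at least one choice needs to tolerate that final chunk.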