ai
'NaN' token counts when using streamText with Azure OpenAI models
Description
I am using streamText with the Azure OpenAI provider for the AI SDK, using createAzure to create the provider instance. When I try to read the token counts, I get NaN values (both in the API route's onFinish() and in useChat({ onFinish: () => {} })).
log
[0] 👉 azureChat: start save data
[0] 👉 azureChat usage: { promptTokens: NaN, completionTokens: NaN, totalTokens: NaN }
Code example
import { createAzure } from '@ai-sdk/azure';

export const azure = createAzure({
  resourceName: 'resource-name',
  apiKey: process.env.AZURE_OPENAI_API_KEY,
});

const result = await streamText({
  model: azure(selectedModel),
  system: systemContent,
  messages: strictPayload.messages,
  maxTokens: maxTokens,
  maxRetries: 3,
  onFinish: async ({ text, usage }) => {
    appendData.close();
    console.log('👉 azureChat: start save data');
    console.log('👉 azureChat usage:', usage);
    const { promptTokens, completionTokens, totalTokens } = usage;
    const lastMessageContent =
      strictPayload.messages[strictPayload.messages.length - 1].content;
    try {
      if (typeof lastMessageContent === 'string') {
        await saveChatMessage(
          lastMessageContent,
          text,
          JSON.stringify(strictPayload.messages),
          activeConversationId
        );
      } else {
        console.error(
          'azureChat: lastMessageContent is not a string:',
          lastMessageContent
        );
      }
      if (data?.userId && data?.sessionId) {
        await updateUserTokenUsage({
          userId: data.userId,
          sessionId: data.sessionId,
          promptTokens,
          completionTokens,
          totalTokens,
        });
      } else {
        console.error(
          'azureChat: Missing userId or sessionId for token update'
        );
      }
    } catch (error) {
      console.error('Error in azureChat onFinish:', error);
    }
  },
});
// Component
const {
  messages,
  input,
  handleInputChange,
  handleSubmit,
  setMessages,
  stop,
  data,
  error,
} = useChat({
  api: '/api/chat',
  body: {
    activeConversation,
    selectedModel,
    selectedToggle: selectedRole,
    internet: false,
  },
  onResponse: () => {
    responseEnd.current = false;
    if (activeConversation) return;
  },
  onError: () => {
    responseEnd.current = true;
  },
  onFinish: (message, { usage }) => {
    responseEnd.current = true;
    refreshConversationList(data);
    setAIState((prev) => ({
      ...prev,
      usage,
    }));
  },
});
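As a stopgap while the provider returns NaN, a defensive check (my own sketch, not part of the SDK; updateUserTokenUsage is the app's own helper from the code above) can keep invalid counts out of the database:

```typescript
// Defensive helper (illustrative, not from the SDK): returns true only when
// every token count is a finite number, so a missing provider usage report
// (which surfaces as NaN) never reaches updateUserTokenUsage.
interface TokenUsage {
  promptTokens: number;
  completionTokens: number;
  totalTokens: number;
}

function hasValidUsage(usage: TokenUsage): boolean {
  return (
    Number.isFinite(usage.promptTokens) &&
    Number.isFinite(usage.completionTokens) &&
    Number.isFinite(usage.totalTokens)
  );
}
```

In onFinish, the token-usage write would then be wrapped in `if (hasValidUsage(usage)) { ... }`.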
Additional context
// package.json
"dependencies": {
  "@ai-sdk/anthropic": "^0.0.35",
  "@ai-sdk/azure": "^0.0.17",
  "@ai-sdk/openai": "^0.0.40",
  "ai": "^3.3.0",
  "react": "18.3.1",
  "react-dom": "18.3.1",
  "next": "14.2.5",
  ...
It seems that token counts for streaming are currently not supported by Azure OpenAI: https://learn.microsoft.com/en-us/answers/questions/1805363/azure-openai-streaming-token-usage
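For context on why the counts surface as NaN rather than undefined, here is a plausible sketch (illustrative only, not the SDK's actual internals): if the usage fields of the stream are coerced with Number(), then a provider that never sends a usage chunk yields Number(undefined), which is NaN:

```typescript
// Illustrative sketch: when no chunk in the stream carries a usage payload,
// Number(undefined) produces NaN for every field.
interface StreamChunk {
  usage?: { prompt_tokens: number; completion_tokens: number };
}

function extractUsage(chunks: StreamChunk[]) {
  // Providers that support streamed usage send it on (typically) the last chunk.
  const withUsage = chunks.find((c) => c.usage !== undefined);
  const promptTokens = Number(withUsage?.usage?.prompt_tokens);
  const completionTokens = Number(withUsage?.usage?.completion_tokens);
  return {
    promptTokens,
    completionTokens,
    totalTokens: promptTokens + completionTokens,
  };
}
```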
@lgrammel This is not only related to Azure; it's also not working for OpenAI/Google/Mistral. Fireworks works when using the OpenAI provider with the Fireworks base path. I am sure it is related to the use of a registry for LLM providers.
When I use streamText with the registry, it gives me NaN:
model: registry.languageModel('openai:gpt-3.5-turbo'),
{ promptTokens: NaN, completionTokens: NaN, totalTokens: NaN }
But when I use streamText without a registry provider, it reports proper usage:
model: openai('gpt-3.5-turbo')
{ promptTokens: 351, completionTokens: 10, totalTokens: 361 }
@trulymittal how are you setting up the providers in the registry? createOpenAI needs to be set up with compatibility: "strict" for streaming usage: https://sdk.vercel.ai/providers/ai-sdk-providers/openai#provider-instance
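For anyone else hitting this: as I understand it, the practical difference in strict mode is that the request body includes OpenAI's stream_options: { include_usage: true }, which compatible mode omits so that OpenAI-compatible third-party endpoints that reject unknown fields keep working. A rough sketch of that gating (illustrative only, not the SDK's actual code):

```typescript
// Hypothetical sketch of how a compatibility flag could gate the
// OpenAI-specific stream_options field that enables streamed usage.
type Compatibility = 'strict' | 'compatible';

function buildStreamRequestBody(
  model: string,
  compatibility: Compatibility
): Record<string, unknown> {
  const body: Record<string, unknown> = { model, stream: true };
  // Only send stream_options to the real OpenAI API; many
  // OpenAI-compatible servers reject unknown fields.
  if (compatibility === 'strict') {
    body.stream_options = { include_usage: true };
  }
  return body;
}
```

Without that field, OpenAI never reports usage on streamed responses, so the counts end up NaN.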
Double-checked Mistral and it provides the token usage information, try e.g. https://github.com/vercel/ai/blob/main/examples/ai-core/src/stream-text/mistral.ts
@lgrammel apologies, it was my mistake not to be using the strict compatibility mode. Thanks for this.
@lgrammel https://github.com/Azure/azure-rest-api-specs/blob/main/specification/cognitiveservices/data-plane/AzureOpenAI/inference/preview/2024-08-01-preview/inference.yaml#L4075C5-L4075C32
Streaming token usage is coming and is in fact available now with the 2024-08-01-preview API version.
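Assuming createAzure exposes an apiVersion option (worth verifying against the current @ai-sdk/azure docs before relying on it), opting into that preview version might look something like:

```typescript
import { createAzure } from '@ai-sdk/azure';

// Assumption: createAzure accepts an apiVersion option; check the
// @ai-sdk/azure documentation for the option's exact name and support.
export const azure = createAzure({
  resourceName: 'resource-name',
  apiKey: process.env.AZURE_OPENAI_API_KEY,
  apiVersion: '2024-08-01-preview',
});
```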
@lgrammel, perhaps parameters could have some small inference service compatibility badges next to them in the docs (like single colored circles with a logo and a tooltip indicating the service name). Had the same issue as @trulymittal but compatibility: "strict" fixed it for me for openai (but others still return NaN).
https://github.com/vercel/ai/pull/3294
@lgrammel It works for text, but when a message contains image content it fails to respond properly:
[main] ⨯ Error: failed to pipe response
[main] at pipeToNodeResponse (/test-azure0.47/node_modules/next/dist/server/pipe-readable.js:126:15)
[main] at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
[main] at async sendResponse (/test-azure0.47/node_modules/next/dist/server/send-response.js:40:13)
[main] at async doRender (/test-azure0.47/node_modules/next/dist/server/base-server.js:1375:25)
[main] at async cacheEntry.responseCache.get.routeKind (/test-azure0.47/node_modules/next/dist/server/base-server.js:1567:28)
[main] at async DevServer.renderToResponseWithComponentsImpl (/test-azure0.47/node_modules/next/dist/server/base-server.js:1475:28)
[main] at async DevServer.renderPageComponent (/test-azure0.47/node_modules/next/dist/server/base-server.js:1901:24)
[main] at async DevServer.renderToResponseImpl (/test-azure0.47/node_modules/next/dist/server/base-server.js:1939:32)
[main] at async DevServer.pipeImpl (/test-azure0.47/node_modules/next/dist/server/base-server.js:914:25)
[main] at async NextNodeServer.handleCatchallRenderRequest (/test-azure0.47/node_modules/next/dist/server/next-server.js:272:17)
[main] at async DevServer.handleRequestImpl (/test-azure0.47/node_modules/next/dist/server/base-server.js:810:17)
[main] at async /test-azure0.47/node_modules/next/dist/server/dev/next-dev-server.js:339:20
[main] at async Span.traceAsyncFn (/test-azure0.47/node_modules/next/dist/trace/trace.js:154:20)
[main] at async DevServer.handleRequest (/test-azure0.47/node_modules/next/dist/server/dev/next-dev-server.js:336:24)
[main] at async invokeRender (/test-azure0.47/node_modules/next/dist/server/lib/router-server.js:173:21)
[main] at async handleRequest (/test-azure0.47/node_modules/next/dist/server/lib/router-server.js:350:24)
[main] at async requestHandlerImpl (/test-azure0.47/node_modules/next/dist/server/lib/router-server.js:374:13)
[main] at async Server.requestListener (/test-azure0.47/node_modules/next/dist/server/lib/start-server.js:141:13) {
[main] [cause]: TypeError: terminated
[main] at Fetch.onAborted (node:internal/deps/undici/undici:10823:53)
[main] at Fetch.emit (node:events:520:28)
[main] at Fetch.terminate (node:internal/deps/undici/undici:9981:14)
[main] at Object.onError (node:internal/deps/undici/undici:10935:38)
[main] at Request.onError (node:internal/deps/undici/undici:2055:31)
[main] at Object.errorRequest (node:internal/deps/undici/undici:1576:17)
[main] at TLSSocket.<anonymous> (node:internal/deps/undici/undici:6050:16)
[main] at TLSSocket.emit (node:events:532:35)
[main] at node:net:339:12
[main] at TCP.done (node:_tls_wrap:659:7)
[main] at TCP.callbackTrampoline (node:internal/async_hooks:130:17) {
[main] [cause]: Error: read ECONNRESET
[main] at TLSWrap.onStreamRead (node:internal/stream_base_commons:218:20)
[main] at TLSWrap.callbackTrampoline (node:internal/async_hooks:130:17) {
[main] errno: -54,
[main] code: 'ECONNRESET',
[main] syscall: 'read'
[main] }
[main] }
[main] }
@leolorenzoluis is this a new issue or was it there before?
@lgrammel It's with the new package. Reverting to the old version works fine, so it has something to do with the new version 0.0.47.
@leolorenzoluis I investigated and it seems to be an issue on the Azure side. When you set stream_options: { include_usage: true } (which is required for token usage when streaming), requests with images never terminate. For now you can stick to v0.0.46 if you use image inputs.
The latest Azure SDK is definitely bugged. Images won't work; I had to revert to v0.0.46.
@lgrammel It would be great if the SDK had better unit tests for critical functionality (such as image attachments, tool calling, etc.) across different providers, because with no clear error other than "Failed to pipe response" (server-side) and "Failed to load resource: net::ERR_INCOMPLETE_CHUNKED_ENCODING" (client-side), this was almost impossible to debug.
same issue here, had to downgrade to 0.0.46
@lgrammel I am facing an ERR_INCOMPLETE_CHUNKED_ENCODING 200 (OK) issue with the openai provider while generating images. I am currently using "@ai-sdk/openai": "2.0.42" and "ai": "^5.0.26". Can you please help me with this issue?