openai-node
createChatCompletion() takes a long time to process.
Describe the bug
As described in the title, the method takes a long time to respond. This is a big problem because, on Vercel, the timeout limit for any call is 5 seconds, and on Netlify the limit is 10 seconds, but most often the call takes more than 10 seconds to respond. As a result, my website is not working after refactoring the site to use the new gpt-3.5-turbo model. (It works fine with davinci.)
Basically, my website works on localhost but not when I deploy it to any service. Am I missing something? Is there a way to reduce the time?
To Reproduce
const { Configuration, OpenAIApi } = require("openai");

const configuration = new Configuration({ apiKey: process.env.OPENAI_API_KEY });
const openai = new OpenAIApi(configuration);

const completion = await openai.createChatCompletion({
  model: "gpt-3.5-turbo",
  messages: [
    {
      role: "user",
      content: "Write a blog on artificial intelligence",
    },
  ],
});
This takes more than 10 seconds to complete.
Code snippets
No response
OS
Windows 11
Node version
v18.12.1
Library version
3.2.1
same issue
The definition of createImageEdit is modified from

createImageEdit: async (image: File, mask: File, prompt: string, n?: number ...

to

createImageEdit: async (image: File, prompt: string, mask?: File, n?: number ...

The second param is now prompt, and the mask becomes optional.
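For illustration, a hypothetical call using the updated parameter order; only the parameters visible in the truncated signature above are used, and the file name and prompt are made up:

const fs = require("fs");

// prompt is now the second argument; mask can be omitted since it is optional
const edit = await openai.createImageEdit(
  fs.createReadStream("input.png"),  // image
  "Add a flamingo next to the pool"  // prompt (moved up from third position)
);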
Sometimes createChatCompletion hangs indefinitely for me 😕
It never really hung indefinitely for me, but I had some 240-second timeouts 😅
@eunemic It was hanging indefinitely for me sometimes, even with a timeout passed into the options. It seems to happen with a bad request, e.g. unescaped characters; this is axios behavior. Wrap the call in a try/catch to avoid the hang, as in the sketch below.
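A minimal sketch of that guard, assuming the v3 axios-based client where the second argument of createChatCompletion is an AxiosRequestConfig; the 15-second timeout value is arbitrary:

try {
  const completion = await openai.createChatCompletion(
    {
      model: "gpt-3.5-turbo",
      messages: [{ role: "user", content: "Write a blog on artificial intelligence" }],
    },
    { timeout: 15000 } // axios aborts the request after 15s instead of waiting forever
  );
  console.log(completion.data.choices[0].message.content);
} catch (err) {
  // Axios rejects on timeouts and malformed requests; catching here turns an
  // apparent hang into an explicit error you can log and handle.
  console.error("Chat completion failed:", err.message);
}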
I'm also finding responses take anywhere from 10736ms to 22304ms using this API, even with shorter questions and prompts. Is this just the nature of how long responses take to generate, or is there more that can be done to make the response more efficient?
I find the Playground responds much faster to the same prompts.
Could it be that streaming completions aren't supported? Is streaming what makes the Playground seem faster, since you see tokens as they arrive instead of waiting for the whole response to be ready? (See the sketch below.)
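The v3 client doesn't document streaming, but a workaround widely shared in this repo's issues is to set stream: true and pass responseType: "stream" through to axios, then parse the server-sent events yourself. A rough sketch, assuming chunk boundaries line up with SSE lines (a robust version would buffer partial lines across chunks):

const res = await openai.createChatCompletion(
  {
    model: "gpt-3.5-turbo",
    messages: [{ role: "user", content: "Write a blog on artificial intelligence" }],
    stream: true,
  },
  { responseType: "stream" } // hand the raw HTTP stream back through axios
);

res.data.on("data", (chunk) => {
  // The body arrives as server-sent events: lines of the form "data: {...}".
  for (const line of chunk.toString().split("\n")) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("data:")) continue;
    const payload = trimmed.slice("data:".length).trim();
    if (payload === "[DONE]") continue; // end-of-stream sentinel
    const delta = JSON.parse(payload).choices[0].delta;
    if (delta.content) process.stdout.write(delta.content); // tokens print as they arrive
  }
});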
It would be nice to see some of the many similar questions about the performance of the API answered by OpenAI. For those of us working on projects that will use the API, this is a big risk to pass off as "teething problems" that "will surely be fixed in time". These issues do not seem to be unique to any one person and their implementation; they are widespread. The same delays do not seem to affect the Playground.
The speed of gpt-3.5-turbo has been greatly improved since March. You may also consider using a smaller max_tokens to limit the amount of time the model spends creating the response.
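For example, a minimal sketch; the 256-token cap is an arbitrary value to illustrate the idea:

const completion = await openai.createChatCompletion({
  model: "gpt-3.5-turbo",
  messages: [{ role: "user", content: "Write a blog on artificial intelligence" }],
  max_tokens: 256, // fewer tokens to generate means less time spent producing them
});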