openai-node

createChatCompletion() takes a long time to process.

ikramhasan opened this issue 1 year ago

Describe the bug

As described in the title, the method takes a long time to respond. This is a big problem because on Vercel the timeout limit for any call is 5 seconds, and on Netlify the limit is 10 seconds, but the call most often takes more than 10 seconds to respond. As a result, my website stopped working after I refactored it to use the new gpt-3.5-turbo model (it works fine with davinci).

Basically, my website works on localhost but not when I deploy it to any service. Am I missing something? Is there a way to reduce the time?

To Reproduce

    // openai-node v3.x setup
    const { Configuration, OpenAIApi } = require("openai");
    const configuration = new Configuration({ apiKey: process.env.OPENAI_API_KEY });
    const openai = new OpenAIApi(configuration);

    const completion = await openai.createChatCompletion({
      model: "gpt-3.5-turbo",
      messages: [
        {
          role: "user",
          content: "Write a blog on artificial intelligence",
        },
      ],
    });

This takes more than 10 seconds to complete.

Code snippets

No response

OS

Windows 11

Node version

v18.12.1

Library version

3.2.1

ikramhasan, Mar 05 '23

same issue

Developeranees, Mar 07 '23

On a related note, the definition of createImageEdit was changed from createImageEdit: async (image: File, mask: File, prompt: string, n?: number ...) to createImageEdit: async (image: File, prompt: string, mask?: File, n?: number ...).

The second parameter is now prompt, and mask has become optional.
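
For anyone hit by that change, here is a minimal sketch of a call against the new signature; the image and mask variables and the prompt are placeholders, not from the original report:

    // Old order: (image, mask, prompt, ...); new order: (image, prompt, mask?, ...)
    const result = await openai.createImageEdit(
      imageFile,              // source image (File / ReadStream), placeholder variable
      "Add a hat to the cat", // prompt is now the second argument
      maskFile                // mask is now optional and comes third
    );
    console.log(result.data.data[0].url);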

zhongkai, Mar 15 '23

Sometimes createChatCompletion hangs indefinitely for me 😕

eumemic, Mar 20 '23

Sometimes createChatCompletion hangs indefinitely for me 😕

It never really hung indefinitely for me, but I did get some 240-second timeouts 😅

Olovorr, Mar 20 '23

@eumemic It was hanging indefinitely for me sometimes too, even with a timeout passed into the options. It seems to be related to a bad request, e.g. unescaped characters; that's axios behavior. Wrap the call in a try/catch to avoid the hang.
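
For reference, a minimal sketch of that workaround with the v3.x axios-based client; the timeout value and prompt are illustrative, not prescriptive:

    try {
      const completion = await openai.createChatCompletion(
        {
          model: "gpt-3.5-turbo",
          messages: [{ role: "user", content: prompt }],
        },
        { timeout: 30000 } // passed through to axios: abort the request after 30 s
      );
      console.log(completion.data.choices[0].message.content);
    } catch (error) {
      if (error.response) {
        // The API answered with a non-2xx status (e.g. a malformed request)
        console.error(error.response.status, error.response.data);
      } else {
        // Network error or timeout: the request never completed
        console.error(error.message);
      }
    }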

imagine, Mar 21 '23

I'm also seeing response times between 10736 ms and 22304 ms with this API, even with shorter questions and prompts. Is this just how long responses take to generate, or is there more that can be done to make them faster?

I find the Playground responds much faster to the same prompts.

Could it be that streaming completions aren't supported here? Is streaming what makes the Playground feel faster, since output appears before the whole response is ready?
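
Streaming isn't a first-class feature in the v3 client, but a common workaround is to ask axios for a stream and parse the server-sent events yourself. A rough sketch under that assumption (SSE parsing simplified, not an official API):

    const response = await openai.createChatCompletion(
      {
        model: "gpt-3.5-turbo",
        messages: [{ role: "user", content: "Write a blog on artificial intelligence" }],
        stream: true,
      },
      { responseType: "stream" } // tell axios to hand back a readable stream
    );

    response.data.on("data", (chunk) => {
      // Each chunk carries one or more "data: {...}" SSE lines
      const lines = chunk.toString().split("\n").filter((l) => l.startsWith("data: "));
      for (const line of lines) {
        const payload = line.slice("data: ".length).trim();
        if (payload === "[DONE]") return;
        const delta = JSON.parse(payload).choices[0].delta.content;
        if (delta) process.stdout.write(delta); // tokens appear as they are generated
      }
    });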

patcat, Mar 24 '23

It would be nice to see some of the many similar questions about API performance answered by OpenAI. For those of us building projects on the API, this is too big a risk to write off as "teething problems" that "will surely be fixed in time." These issues do not seem to be unique to any one person or implementation; they are widespread. The same delays do not seem to affect the Playground.

kitfit-dave, Apr 14 '23

The speed of gpt-3.5-turbo has been greatly improved since March.

You may also consider using a smaller max_tokens to limit the amount of time the model spends creating the response.
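
A sketch of that, reusing the call from the original report; the 256-token cap is just an illustrative value:

    const completion = await openai.createChatCompletion({
      model: "gpt-3.5-turbo",
      messages: [{ role: "user", content: "Write a blog on artificial intelligence" }],
      max_tokens: 256, // cap the reply length, which also caps generation time
    });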

rattrayalex, Jul 10 '23