fast-llm-security-guardrails
[Feature Request] - streaming support for API calls
WHAT? Add streaming support for API responses.
WHY? Improves user experience for long or slow completions.
Additional requirements
- Support TypeScript.
- Support serverless JS runtimes such as Cloudflare Pages and Vercel.
- If this feature requires creating a TypeScript client, consider returning a ReadableStream (see the sketch after this list).
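For illustration only, here is one possible shape such a TypeScript client could expose; the type and method names below are hypothetical and not part of the current API. Returning a web-standard ReadableStream keeps the client usable in Node 18+, Cloudflare Pages Functions, and Vercel serverless/edge runtimes alike:

// Hypothetical client surface, for illustration only; none of these names exist in the current API.
interface GuardrailsStreamClient {
  // Resolving to a web-standard ReadableStream of UTF-8 encoded chunks works the same
  // in Node 18+, Cloudflare Pages Functions, and Vercel serverless/edge runtimes.
  streamCompletion(input: { prompt: string }): Promise<ReadableStream<Uint8Array>>;
}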
REFERENCE OpenAI supports streaming for both the Chat Completions and the Assistants API: https://platform.openai.com/docs/api-reference/streaming
For reference, this is how I currently have to handle OpenAI's stream responses in my backend and forward them to the frontend:
const textStream = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [
    { role: 'system', content: SYSTEM_PROMPT },
    { role: 'user', content: userText }
  ],
  stream: true
});

const encoder = new TextEncoder();

return new Response(
  new ReadableStream({
    async start(controller) {
      // Forward each chunk from the original stream
      for await (const chunk of textStream) {
        // Extract the content per the OpenAI API response structure
        const message = chunk.choices[0]?.delta?.content || '';
        controller.enqueue(encoder.encode(message));
      }
      // Close the stream once all chunks are processed
      controller.close();
    },
    cancel() {
      console.log('cancel and abort');
    }
  }),
  {
    headers: {
      'cache-control': 'no-cache',
      'Content-Type': 'text/event-stream'
    }
  }
);
Ideally, the API should return a ReadableStream, so all I would need to do is wrap it in a Response.
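As a minimal sketch, assuming a hypothetical client method streamCompletion that resolves to a ReadableStream<Uint8Array> (the name and options are made up for illustration), the whole route handler would reduce to:

// Sketch only: 'client.streamCompletion' is a hypothetical method assumed to resolve
// to a web-standard ReadableStream<Uint8Array>; it is not an existing API.
declare const client: {
  streamCompletion(input: { prompt: string }): Promise<ReadableStream<Uint8Array>>;
};

export async function POST(req: Request): Promise<Response> {
  const { userText } = await req.json();
  const stream = await client.streamCompletion({ prompt: userText });
  // No manual chunk handling: hand the stream straight to the Response.
  return new Response(stream, {
    headers: {
      'cache-control': 'no-cache',
      'Content-Type': 'text/event-stream'
    }
  });
}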