gateway
Request to stream a chat completion results in a buffered response instead of a stream
What Happened?
I tried running the gateway locally with npx @portkey-ai/gateway, npm run dev:node, and node build/start-server.js (Node 18). Even though I set the "stream" parameter in the config, the response was buffered and returned to the client all at once instead of being streamed. This appears to be related to the "compress" middleware step: after removing the following middleware, the response was streamed correctly:
// app.use('*', (c, next) => {
//   const runtime = getRuntimeKey();
//   if (runtime !== 'lagon' && runtime !== 'workerd') {
//     return compress()(c, next);
//   }
//   return next();
// });
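The buffering is expected behavior for compression middleware: gzip/brotli transforms accumulate output before flushing, so each SSE chunk sits in the compressor instead of reaching the client immediately. Rather than removing compression entirely, one possible direction (a sketch only, not the project's actual fix; the helper name `shouldCompress` is hypothetical) is to skip compression for streaming content types:

```typescript
// Hypothetical predicate: decide whether a response may safely be run
// through the compress middleware. stream: true responses from
// /chat/completions arrive as server-sent events, and compressing them
// delays every chunk until the stream closes.
function shouldCompress(contentType: string | null): boolean {
  // Unknown content type: skip compression rather than risk buffering a stream.
  if (contentType === null) return false;
  // Pass server-sent events through untouched so each chunk flushes immediately.
  return !contentType.includes('text/event-stream');
}
```

With a check like this, the middleware could still compress ordinary JSON responses while letting streamed completions through unmodified.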
What Should Have Happened?
The response should be streamed to the client.
Relevant Code Snippet
Code on the client side:
import OpenAI from "openai";

const run = async () => {
  const client = new OpenAI({
    apiKey: "xxx",
    baseURL: "http://127.0.0.1:8787/v1",
    defaultHeaders: {
      'x-portkey-config': JSON.stringify({
        "provider": "openai",
        "api_key": "xxx",
      }),
    },
  });
  const stream = await client.chat.completions.create({
    model: "gpt-3.5-turbo",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Write a short story about Mumin" },
    ],
    stream: true,
    max_tokens: 254,
  });
  for await (const part of stream) {
    console.log(part.choices[0]?.delta?.content || "", new Date());
  }
};

run();
I am also facing the same issue. Removing
// index.ts
app.use('*', (c, next) => {
  const runtime = getRuntimeKey();
  if (runtime !== 'lagon' && runtime !== 'workerd') {
    return compress()(c, next);
  }
  return next();
});
and removing the content-encoding header by adding the line below in updateResponseHeaders
// handlerUtils.ts
response.headers.delete('content-encoding');
solved it. Is a permanent fix available? @VisargD
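A minimal sketch of that second part of the workaround, assuming the gateway works with standard Fetch API Headers objects (the helper name stripStaleEncodingHeaders is hypothetical, not from the codebase):

```typescript
// Hypothetical helper mirroring the workaround above: once the gateway
// re-streams a body it has already decoded, the upstream content-encoding
// header no longer matches the bytes on the wire. Leaving it in place makes
// clients try to gunzip plain text, so the stream appears broken or buffered.
function stripStaleEncodingHeaders(headers: Headers): Headers {
  // Copy first: Headers on a received Response may be immutable.
  const cleaned = new Headers(headers);
  cleaned.delete('content-encoding');
  return cleaned;
}
```

Returning a copy instead of mutating in place sidesteps the immutable-headers guard that Fetch implementations apply to incoming responses.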
Issue can be closed
@au-re thank you for raising the issue, and @mohankumarelec for fixing it! This is amazing. Thanks to @flexchar and @arbi-dev as well.