No streaming response for ALB + Lambda (I used a Dockerfile to build the image and push it to ECR)
The response is chunked ("object": "chat.completion.chunk"), but it arrives all at once instead of streaming. Which step do I need to check to solve this problem?
I guess "handler = Mangum(app)" can't handle streaming: Mangum buffers the full ASGI response body before returning it from the Lambda handler.
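To make the buffering theory concrete, here is a minimal stdlib-only sketch (payload shape and model name are illustrative, not the gateway's actual code) of OpenAI-style SSE frames. A server that can flush sends each frame as it is produced; a buffering adapter like Mangum joins all frames into one response body, which is why the client sees a correctly chunk-formatted stream arrive in a single burst.

```python
import json

def sse_chunks(tokens, model="example-model"):
    """Yield OpenAI-style SSE frames, one per token.

    Hypothetical sketch: a streaming-capable server flushes each frame
    immediately, while a buffering adapter (e.g. Mangum on Lambda)
    collects every frame and returns the concatenation as one body.
    """
    for tok in tokens:
        payload = {
            "object": "chat.completion.chunk",
            "model": model,
            "choices": [{"index": 0, "delta": {"content": tok}}],
        }
        yield f"data: {json.dumps(payload)}\n\n"
    yield "data: [DONE]\n\n"

# What the buffered Lambda response looks like: all frames joined into one body.
buffered_body = "".join(sse_chunks(["Hel", "lo"]))
```

Note the frames themselves are well-formed either way; only the delivery timing differs, which matches the symptom described above.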
I'm having the same issue. I'm running ALB + Lambda with open-webui as the UI. When I call it from curl, the response is in streaming (SSE) format, but it all arrives at once at the end.
curl $OPENAI_BASE_URL/chat/completions \
  -H "Content-Type: application/json" \
  -H "Accept: text/event-stream" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
    "messages": [
      {
        "role": "user",
        "content": "Print out an example of a python program that iterates over webpages and extracts key information"
      }
    ],
    "stream": true
  }' --no-buffer
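One way to confirm it is a buffering problem rather than a protocol problem is to parse the body that arrives "all at once" and check that it is still a valid event stream. A small stdlib-only parser like the following (a sketch, not part of the gateway) can do that; if it yields well-formed chunks, the stream was assembled correctly and was merely buffered in transit by the ALB + Lambda path.

```python
import json

def parse_sse(body: str):
    """Parse a text/event-stream body into the JSON payloads it carries.

    Returns the decoded "data:" payloads, stopping at the [DONE] sentinel.
    If this succeeds on an all-at-once response, the server produced a
    valid SSE stream and something downstream buffered it.
    """
    events = []
    for line in body.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # ignore blank lines and non-data fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        events.append(json.loads(data))
    return events
```

Timestamping each read on the raw socket (instead of reading the whole body) would additionally show whether any bytes arrive before the end of the response.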
I thought this was just an issue on the AWS Bedrock side, but then I used another program that connected directly to AWS Bedrock (without the gateway) and it got the expected streaming responses from the same models.
I have the same question; all the stream chunks come back together at the end.
Interestingly, I was able to get streaming to work with ECS (mine wasn't working on Lambda initially). I'm guessing there is some config for the Lambda deployment that might need to change, but switching to ECS is a good enough solution for me.