Zed Bedrock AI chat misses streaming in-between updates, only returns full response in a single chunk.
Summary
Zed's Bedrock AI chat does not show intermediate streaming updates; the full response only arrives as a single chunk.
Steps to trigger the problem:
- Configure Amazon Bedrock as an AI provider with valid AWS IAM user credentials that are associated with a valid policy.
- Make sure you have enabled access to any model available from Amazon Bedrock in your chosen region. I used Claude 3.7 Sonnet, but any other model on Bedrock should behave the same.
- Open an AI chat panel, make sure that a Bedrock model is selected, and send a prompt that triggers a longer response, like: "Brainstorm 10 baby metal band names with one paragraph of backstory each."
- Observe how the "Assistant" circular spinner spins for 10+ seconds. After a while, the full response appears in the chat panel at once.
- Optional: Try the same prompt with the same model from a different provider, like Anthropic for the Claude 3.7 model, and observe how the chat panel renders a continuous stream of updates while the model completes its response.
Actual Behavior: After a long wait, the model response appears as one big, final chunk in the AI chat panel.
Expected Behavior: The chat panel should show updates from the model’s streaming completion shortly after submitting the prompt, while the model is completing its response.
I have reproduced the above bug with multiple different models, including Sonnet 3.7 and Amazon Nova Pro. In all cases, streaming updates are missing, and the response comes in as one big chunk at the end. Therefore, this does not seem to be model-specific.
I have also tried the same prompt in the AWS Console Bedrock UI and can see the expected streaming updates there, so this is not related to the Amazon Bedrock service itself. It seems to be specific to the Bedrock support in Zed.
Finally, I verified that this happens both in remote mode and in local mode.
Zed Version and System Specs
Zed: v0.176.1 (Zed)
OS: macOS 15.3.1
Memory: 32 GiB
Architecture: aarch64
First of all, love this:
"Brainstorm 10 baby metal band names with one paragraph of backstory each"
Second, thanks for doing such rigorous testing, that helps a bunch.
And third, thanks for reporting!
It's being looked at
cc: @5herlocked
~~#28137 should sort this out~~
nvm -- working on it still
@probably-neb could probably use your help: in debug mode it actually returns a "streaming" response, but in release mode it returns a single chunk.
Any idea how we can break down where the "speed-up" is affecting user experience?
Flamegraphs haven't been helpful to me -- all they show is the stream_completion closures being optimized away and not taking any time.
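One way to narrow this down without a profiler might be to timestamp each chunk at different layers (provider stream, completion events, UI updates) and see where the arrivals collapse into a single burst. Below is a minimal, standalone sketch of that idea using only the Rust standard library. It is not Zed's code: `Arrival`, `spawn_fake_stream`, `record_arrivals`, and `count_bursts` are hypothetical names, and the fake stream only stands in for a real provider stream.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

/// When a chunk arrived (relative to the start of the request) and how big it was.
#[derive(Debug, Clone, PartialEq)]
pub struct Arrival {
    pub elapsed: Duration,
    pub bytes: usize,
}

/// Hypothetical stand-in for a provider stream: sends each chunk after `delay`.
/// A zero delay approximates a response that was buffered and flushed at once.
pub fn spawn_fake_stream(chunks: Vec<String>, delay: Duration) -> mpsc::Receiver<String> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        for chunk in chunks {
            thread::sleep(delay);
            if tx.send(chunk).is_err() {
                break; // receiver went away, stop producing
            }
        }
    });
    rx
}

/// Drains the receiver and records the arrival time of every chunk.
pub fn record_arrivals(rx: mpsc::Receiver<String>, start: Instant) -> Vec<Arrival> {
    rx.iter()
        .map(|chunk| Arrival { elapsed: start.elapsed(), bytes: chunk.len() })
        .collect()
}

/// Counts bursts: runs of chunks separated by a pause of at least `min_gap`.
/// Many bursts suggest real streaming; a single burst suggests buffering upstream.
pub fn count_bursts(arrivals: &[Arrival], min_gap: Duration) -> usize {
    if arrivals.is_empty() {
        return 0;
    }
    let pauses = arrivals
        .windows(2)
        .filter(|w| w[1].elapsed.saturating_sub(w[0].elapsed) >= min_gap)
        .count();
    1 + pauses
}
```

Recording arrivals at each layer in both debug and release builds, then comparing the burst counts, should show which layer is the first to report a single burst in release mode. The `min_gap` threshold is a judgment call and would need tuning against real chunk timings.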
Do you have a PR up by any chance? If not, could you open one as a draft and ping me? Happy to take a look or send it to the right person.
I mean this behaviour is already in preview and stable. I've been trying to isolate it to understand what's causing the blocking behaviour.
but here -- https://github.com/zed-industries/zed/pull/28281