
Zed Bedrock AI chat misses streaming in-between updates, only returns full response in a single chunk.

Open zalez opened this issue 9 months ago • 7 comments

Summary

The Bedrock AI chat in Zed does not show intermediate streaming updates; the full response arrives as a single chunk.

Steps to trigger the problem:

  1. Configure Amazon Bedrock as an AI provider with valid AWS IAM user credentials that have a valid policy attached.
  2. Make sure you have enabled access to any model available from Amazon Bedrock in your chosen region. I used Claude 3.7 Sonnet, but any other model on Bedrock should behave the same.
  3. Open an AI chat panel, make sure that a Bedrock model is selected, and send a prompt that triggers a longer response, like: "Brainstorm 10 baby metal band names with one paragraph of backstory each."
  4. Observe how the "Assistant" circular spinner spins for 10+ seconds. After a while, the full response appears in the chat panel at once.
  5. Optional: Try the same prompt with the same model from a different provider, like Anthropic for the Claude 3.7 model, and observe how the chat panel renders a continuous stream of updates while the model completes its response.

Actual Behavior: After a long wait, the model response appears in the AI chat panel as one big final chunk.

Expected Behavior: The chat panel should show updates from the model’s streaming completion shortly after submitting the prompt, while the model is completing its response.
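To make the difference between the actual and expected behavior measurable, here is a minimal sketch that classifies a response by the arrival times of its chunks. The names `ChunkEvent`, `Delivery`, `classify`, and the `burst_window` threshold are all hypothetical and chosen for illustration; none of them come from Zed or the Bedrock SDK.

```rust
use std::time::Duration;

/// One observed chunk: when it arrived (relative to sending the prompt)
/// and how many bytes of text it carried. Hypothetical type for illustration.
#[derive(Debug, Clone, Copy)]
pub struct ChunkEvent {
    pub arrived_at: Duration,
    pub bytes: usize,
}

/// How a response was delivered, judged only by chunk arrival times.
#[derive(Debug, PartialEq)]
pub enum Delivery {
    /// No chunks were observed at all.
    Empty,
    /// A single chunk, or several chunks that all arrived within `burst_window`.
    SingleBurst,
    /// Chunks were spread out over time, as expected from a streaming API.
    Streamed,
}

/// Classify a response. `burst_window` is an assumed threshold: if the first
/// and last chunk arrive closer together than this, treat it as one burst.
pub fn classify(events: &[ChunkEvent], burst_window: Duration) -> Delivery {
    let (first, last) = match (events.first(), events.last()) {
        (Some(first), Some(last)) => (first, last),
        _ => return Delivery::Empty,
    };
    // saturating_sub avoids a panic if timestamps are out of order.
    let spread = last.arrived_at.saturating_sub(first.arrived_at);
    if events.len() == 1 || spread < burst_window {
        Delivery::SingleBurst
    } else {
        Delivery::Streamed
    }
}
```

With a threshold of a few hundred milliseconds, the behavior described here (one chunk after 10+ seconds) would classify as `SingleBurst`, while the same prompt through a provider that streams correctly should classify as `Streamed`.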

I have reproduced the above bug with multiple different models, including Sonnet 3.7 and Amazon Nova Pro. In all cases, streaming updates are missing, and the response comes in as one big chunk at the end. Therefore, this does not seem to be model-specific.

I have also tried the same prompt in the AWS Console Bedrock UI and can see the expected streaming updates there. So this is not a problem with the Amazon Bedrock service itself; it seems to be specific to the Bedrock support in Zed.

Finally, I verified that this happens both in remote mode and in local mode.

Zed Version and System Specs

Zed: v0.176.1 (Zed) OS: macOS 15.3.1 Memory: 32 GiB Architecture: aarch64

zalez avatar Mar 04 '25 12:03 zalez

First of all, love this

Brainstorm 10 baby metal band names with one paragraph of backstory each

Second, thanks for doing such rigorous testing, that helps a bunch.

And third, thanks for reporting!

probably-neb avatar Mar 21 '25 00:03 probably-neb

It's being looked at

cc: @5herlocked

joshrutkowski avatar Mar 21 '25 01:03 joshrutkowski

~~#28137 should sort this out~~

nvm -- working on it still

5herlocked avatar Apr 05 '25 16:04 5herlocked

@probably-neb could probably use your help. In debug mode it actually returns a "streaming" response, but in release mode it returns a single chunk.

Any idea how we can break down where the "speed-up" is affecting user experience?

Flamegraphs haven't been helpful to me -- all they show is the stream_completion closures being optimized away and not taking any time.
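One alternative to flamegraphs for this kind of debug-versus-release difference is to timestamp each chunk as the consumer receives it and compare the gaps between the two builds. Below is a rough sketch using a plain synchronous `Iterator` as a stand-in. The real code path in Zed is presumably an async stream, and `Timestamped` and `arrival_gaps` are hypothetical names made up for this example.

```rust
use std::time::{Duration, Instant};

/// Iterator adapter that pairs every item with the time elapsed since the
/// adapter was created. Hypothetical helper, not part of Zed.
pub struct Timestamped<I> {
    inner: I,
    started: Instant,
}

impl<I: Iterator> Timestamped<I> {
    pub fn new(inner: I) -> Self {
        Self { inner, started: Instant::now() }
    }
}

impl<I: Iterator> Iterator for Timestamped<I> {
    type Item = (Duration, I::Item);

    fn next(&mut self) -> Option<Self::Item> {
        // The timestamp is taken after the inner iterator yields, so it
        // reflects when the consumer actually received the chunk.
        let item = self.inner.next()?;
        Some((self.started.elapsed(), item))
    }
}

/// Gaps between consecutive arrival times. A long initial wait followed by
/// near-zero gaps suggests the chunks were buffered somewhere upstream.
pub fn arrival_gaps(arrivals: &[Duration]) -> Vec<Duration> {
    arrivals
        .windows(2)
        .map(|pair| pair[1].saturating_sub(pair[0]))
        .collect()
}
```

Logging the output of `arrival_gaps` for the same prompt in a debug build and a release build would show whether the chunks reach the consumer at different times, or whether they arrive on time and are only rendered late.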

5herlocked avatar Apr 07 '25 16:04 5herlocked

Do you have a PR up by any chance? If not, could you open one as a draft and ping me? Happy to take a look or send it to the right person.

probably-neb avatar Apr 07 '25 21:04 probably-neb

I mean this behaviour is already in preview and stable. I've been trying to isolate it to understand what's causing the blocking behaviour.

5herlocked avatar Apr 07 '25 22:04 5herlocked

but here -- https://github.com/zed-industries/zed/pull/28281

5herlocked avatar Apr 07 '25 22:04 5herlocked