Streaming support in `TextBlock`
While the SDK streams various message blocks (e.g., tool use/result, system message, result message, text message), `TextBlock` content is only delivered once it is fully available. This delays rendering for long messages even when time-to-first-token (TTFT) is low. Supporting partial streaming of `TextBlock` content would let applications render text incrementally, significantly improving the user experience.
In the current streaming guide, the entire block is received at once:
https://github.com/anthropics/claude-code-sdk-python/blob/3010aaf092186a24dae240d78f0ee7d2e0d5d62a/examples/streaming_mode_ipython.py#L26
I ran into the same problem: neither the text nor the tool parameters arrive incrementally. Both wait for the complete response to be returned.
Same problem here: a long wait, then the text appears all at once. +1 for fixing this.
Me too:
You can see the text is split into chunks: the first chunk ends with "Would you lik" and the second continues with "e to create...". The text is being streamed in 50-character chunks!
However, the chunks all appear at the same timestamp because the SDK waits for the full response before handing us the TextBlock. The underlying issue is that the Claude Agent SDK itself doesn't provide token-by-token streaming; it only provides message-level streaming.
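To make the chunk boundaries concrete, here is a minimal sketch of why a chunk can end mid-word: the chunks are simply fixed-size slices of the full text, so a boundary falls wherever the size limit lands (the sample sentence and chunk size below are illustrative, chosen so the split mirrors the "Would you lik" / "e to create..." example):

```python
full_text = "Would you like to create the file now?"

# Slice into fixed-size chunks, mimicking the chunking observed above
# (a 13-char size is used so the mid-word split is visible in a short string):
chunk_size = 13
chunks = [full_text[i:i + chunk_size] for i in range(0, len(full_text), chunk_size)]
print(chunks[0])  # → Would you lik

# Because the SDK delivers all chunks at the same time, concatenating them
# only yields readable text after the full response has already arrived:
assert "".join(chunks) == full_text
```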
More from Claude Code:
Investigation Complete: TextBlock Streaming Issue
I've completed the investigation into why TextBlock content doesn't stream properly when chunks arrive from the LLM. Here's what I found:
Root Cause
When include_partial_messages=True is enabled, the SDK receives streaming text chunks as StreamEvent objects containing raw Anthropic API events. However, these text deltas are NOT automatically accumulated into TextBlock objects.
The Problem
Current behavior:
- Users receive StreamEvent objects with raw API events
- Text chunks come in content_block_delta events
- Users must manually extract and accumulate text from event.event.get("delta", {}).get("text", "")
Expected behavior:
- Users expect TextBlock objects with accumulated text as it streams in
- Similar to how complete messages work
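The manual accumulation the SDK currently forces on users can be sketched as follows. The event dicts are simplified stand-ins for the raw Anthropic API events wrapped in `StreamEvent`; field names follow the `content_block_delta` shape described above:

```python
def accumulate_text_deltas(events):
    """Manually fold raw content_block_delta events into a full string.

    This is the bookkeeping callers must currently do themselves:
    filter for delta events, check the delta type, and concatenate.
    """
    text = ""
    for event in events:
        if event.get("type") == "content_block_delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text_delta":
                text += delta.get("text", "")
    return text

# Simplified mock events standing in for what StreamEvent wraps:
events = [
    {"type": "content_block_start", "content_block": {"type": "text"}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Hel"}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "lo!"}},
    {"type": "content_block_stop"},
]
print(accumulate_text_deltas(events))  # → Hello!
```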
Code Evidence
In `message_parser.py:93-94`, TextBlocks are only created from complete assistant messages:

```python
case "text":
    content_blocks.append(TextBlock(text=block["text"]))
```
But for streaming (`message_parser.py:158-169`), only raw StreamEvent wrappers are created, with no text extraction.
The e2e tests in `test_include_partial_messages.py:110-116` show users must manually accumulate:

```python
if event.get("type") == "content_block_delta":
    delta = event.get("delta", {})
    if delta.get("type") == "thinking_delta":
        thinking_deltas.append(delta.get("thinking", ""))
```
Further tracing in https://github.com/anthropics/claude-agent-sdk-python/blob/main/examples/streaming_mode.py
```python
async def example_basic_streaming():
    """Basic streaming with context manager."""
    print("=== Basic Streaming Example ===")

    async with ClaudeSDKClient() as client:
        print("User: What is 2+2?")
        await client.query("What is 2+2?")

        # Receive complete response using the helper method
        async for msg in client.receive_response():
            display_message(msg)
```
As a result, streaming is not supported:
How client.receive_response() Works
The Chain:

```
client.receive_response()
        ↓
client.receive_messages()
        ↓
self._query.receive_messages()   # Raw JSON from Claude CLI
        ↓
parse_message(data)              # Parse JSON → Message objects
        ↓
Yields: AssistantMessage, UserMessage, ResultMessage
```
Key Implementation:

```python
async def receive_response(self):
    async for message in self.receive_messages():
        yield message  # ← Yields COMPLETE message objects
        if isinstance(message, ResultMessage):
            return  # Stop after ResultMessage
```
What Gets Yielded:
The `parse_message()` function shows it parses complete blocks:

```python
case "text":
    content_blocks.append(TextBlock(text=block["text"]))                  # ← Complete text
case "thinking":
    content_blocks.append(ThinkingBlock(thinking=block["thinking"]))      # ← Complete thinking
```
The Answer to Your Question:
client.receive_response() yields complete message objects, not incremental text tokens. Each iteration gives you:
- AssistantMessage with complete TextBlock.text
- UserMessage with complete tool results
- ResultMessage with final stats
It does NOT yield token-by-token:

```
"H" → "He" → "Hel" → "Hello"
```

It yields message-by-message:

```
AssistantMessage(TextBlock(text="Hello! How can I help you today?"))  # Complete
UserMessage(ToolResultBlock(...))                                     # Complete
ResultMessage(duration=5.2s)                                          # Complete
```
This is implemented in PR #274 and working.
Tested with Claude Code: text streams incrementally when `accumulate_streaming_content=True` is set. Before this, TextBlock content only appeared once complete.
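For intuition, the accumulation behavior described above could be approximated like this. This is a sketch of the idea only, not the PR's actual implementation; `TextBlock` here is a stand-in dataclass, and the event dicts are simplified mocks of the raw API events:

```python
from dataclasses import dataclass


@dataclass
class TextBlock:  # stand-in for the SDK's TextBlock type
    text: str


def stream_partial_text_blocks(events):
    """Yield a growing TextBlock after each text delta.

    Sketch of the accumulate-as-you-stream idea: instead of handing callers
    raw delta events, fold each delta into the accumulated text and yield
    a TextBlock snapshot so the UI can re-render incrementally.
    """
    accumulated = ""
    for event in events:
        if event.get("type") == "content_block_delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text_delta":
                accumulated += delta.get("text", "")
                yield TextBlock(text=accumulated)


# Mock delta events reusing the chunk boundary from the example above:
events = [
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "Would you lik"}},
    {"type": "content_block_delta", "delta": {"type": "text_delta", "text": "e to create..."}},
]
for block in stream_partial_text_blocks(events):
    print(block.text)
# → Would you lik
# → Would you like to create...
```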
See test code and reproduction steps in https://github.com/anthropics/claude-agent-sdk-python/pull/274#issuecomment-3449741394
The PR solves the manual delta-tracking problem discussed here. It would be good to get it merged.