docs: clarify Streamable HTTP stateless mode semantics and usage
Summary
This PR improves the documentation for the stateless_http mode of the Streamable HTTP transport in the Python SDK. It clarifies what "stateless" means in the MCP context, which features are affected, and when this mode is appropriate.
Motivation
Several production deployments rely on Streamable HTTP transport, and stateless mode is particularly attractive for serverless and load-balanced environments. However, the current documentation doesn't fully explain:
- What "stateless" actually means in terms of session lifecycle
- Which MCP features are unavailable or behave differently in stateless mode
- When to choose stateless vs stateful operation
- How to deploy stateless servers effectively
This directly addresses the concerns raised in issue #1696 about unclear semantics, limitations, and naming.
Changes
Added: Comprehensive "Understanding Stateless Mode" Section
Located after the Streamable HTTP transport examples, this new documentation includes:
1. Clear Definition
- Explains the per-request session model used in stateless mode
- Contrasts with stateful mode's persistent sessions
- Details what happens at the transport level
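For reviewers, the contrast between the two session models can be sketched roughly as follows. This is a toy model in plain Python, not the SDK's implementation: `ToySession`, `StatefulServer`, and `StatelessServer` are hypothetical names introduced only for illustration.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class ToySession:
    """Hypothetical stand-in for a transport session; not an SDK class."""
    session_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    history: list = field(default_factory=list)


class StatefulServer:
    """Keeps sessions alive between requests, keyed by session ID."""

    def __init__(self):
        self.sessions = {}

    def handle(self, session_id, message):
        # Reuse the existing session if the client presents a known ID;
        # otherwise create one and remember it for later requests.
        session = self.sessions.get(session_id)
        if session is None:
            session = ToySession()
            self.sessions[session.session_id] = session
        session.history.append(message)
        return session.session_id, len(session.history)


class StatelessServer:
    """Creates a fresh session for each request and keeps nothing afterwards."""

    def handle(self, message):
        session = ToySession()        # exists for this request only
        session.history.append(message)
        return len(session.history)   # no reference to the session is retained
```

In this model, a second request to `StatefulServer` with the same ID sees the earlier message in its history, while every request to `StatelessServer` starts from an empty one. That is the property that rules out notifications, subscriptions, and multi-turn context in stateless mode.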
2. Feature Compatibility Table
Shows which MCP features work in each mode:
| Feature | Stateful | Stateless |
|---|---|---|
| Server Notifications | ✅ | ❌ |
| Resource Subscriptions | ✅ | ❌ |
| Multi-turn Context | ✅ | ❌ |
| Long-running Tools | ✅ | ⚠️ |
| Tools/Resources/Prompts | ✅ | ✅ |
| Concurrent Requests | ⚠️ | ✅ |
The table is followed by footnotes explaining the limitations.
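As a reading aid, the table can also be expressed as data and queried. The sketch below copies the entries above verbatim; the `COMPATIBILITY` dictionary and the `check_mode` helper are hypothetical and are not part of the SDK or of the documentation change.

```python
# Support levels copied from the table above:
# True = supported, False = unavailable, None = works with caveats (the ⚠️ entries).
COMPATIBILITY = {
    "Server Notifications":    {"stateful": True, "stateless": False},
    "Resource Subscriptions":  {"stateful": True, "stateless": False},
    "Multi-turn Context":      {"stateful": True, "stateless": False},
    "Long-running Tools":      {"stateful": True, "stateless": None},
    "Tools/Resources/Prompts": {"stateful": True, "stateless": True},
    "Concurrent Requests":     {"stateful": None, "stateless": True},
}


def check_mode(mode, required_features):
    """Split the required features into those unavailable in `mode`
    and those that work only with caveats."""
    unavailable = [f for f in required_features if COMPATIBILITY[f][mode] is False]
    caveats = [f for f in required_features if COMPATIBILITY[f][mode] is None]
    return unavailable, caveats
```

For example, `check_mode("stateless", ["Server Notifications", "Long-running Tools"])` reports the first feature as unavailable and the second as carrying caveats, which matches the guidance that servers relying on notifications need stateful mode.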
3. Usage Guidance
Stateless mode is ideal for:
- Serverless deployments (AWS Lambda, Cloud Functions, etc.)
- Load-balanced multi-node deployments without sticky sessions
- Stateless APIs where each request is self-contained
- High concurrency scenarios
- Simplified operations without session management
Stateful mode is needed for:
- Server notifications and subscriptions
- Multi-turn conversation state
- Long-running operations with progress updates
- Connection resumability
4. Three Deployment Patterns
- Pure Stateless (recommended for serverless/auto-scaling)
- Stateful with Sticky Sessions (notifications + load balancing)
- Hybrid Approach (both modes side-by-side)
Each pattern includes a code example.
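The routing difference between the first two patterns can be sketched roughly as below. This is not the code from the documentation and not how any particular load balancer works. It assumes the client sends the `Mcp-Session-Id` header on stateful requests and omits it on stateless ones; the node names and the `pick_node` helper are hypothetical.

```python
import hashlib
from itertools import cycle

NODES = ["node-a", "node-b", "node-c"]  # hypothetical backend names
_round_robin = cycle(NODES)


def pick_node(headers):
    """Choose a backend for a request, given its headers as a dict.

    Assumes header names are already normalized to this exact casing.
    """
    session_id = headers.get("Mcp-Session-Id")
    if session_id is None:
        # Pure stateless: any node can serve the request, so rotate freely.
        return next(_round_robin)
    # Sticky sessions: hash the session ID so that every request for
    # the same session lands on the node that holds its state.
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]
```

Requests without a session ID are spread across all nodes, which is why pure stateless deployments need no affinity configuration. Requests carrying a session ID always map to the same node, at the cost of losing that session if the node goes away.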
5. Technical Details
- Step-by-step session lifecycle in stateless mode
- Performance characteristics (initialization overhead, memory efficiency, scalability)
- Stateless mode design checklist
6. Updated Introduction
Changed the blanket recommendation for stateless mode to a more nuanced statement that directs users to the new section for guidance.
Type of Change
- [x] Documentation only
- [ ] Bug fix
- [ ] New feature
- [ ] Breaking change
Testing
- [x] Documentation builds without errors
- [x] Markdown formatting is correct
- [x] Internal links work properly
- [x] Code examples are syntactically valid
Notes
- This is a documentation-only change; no behavior is modified
- Examples reference existing code in examples/servers/simple-streamablehttp-stateless/
- If maintainers prefer alternative terminology (e.g., "ephemeral mode"), I'm happy to update accordingly
Checklist
- [x] My code follows the project's style guidelines
- [x] I have performed a self-review of my own changes
- [x] I have commented my code where necessary
- [x] My changes generate no new warnings
- [x] Any dependent changes have been merged and published
Closes #1696