
Race Condition in StreamableHTTP Transport Causes ClosedResourceError

Open Edison-A-N opened this issue 3 months ago • 4 comments

Initial Checks

  • [x] I confirm that I'm using the latest version of MCP Python SDK
  • [x] I confirm that I searched for my issue in https://github.com/modelcontextprotocol/python-sdk/issues before opening this issue

Description

Starting from v1.12.0, MCP servers in HTTP Streamable mode experience a race condition that causes ClosedResourceError exceptions when requests fail validation early (e.g., due to incorrect Accept headers). This issue is particularly noticeable with fast-failing requests and can be reproduced consistently.

Root Cause Analysis

NOTE: All code references are based on v1.14.0 with anyio==4.10.0.

Execution Flow

  1. Transport Setup: In streamable_http_manager.py line 171, connect() is called, which internally creates a message_router task
  2. Message Router: The message_router enters an async for write_stream_reader loop (line 831 in streamable_http.py)
  3. Checkpoint Yield: write_stream_reader is an anyio memory object stream; its receive() method (anyio/streams/memory.py line 109) calls checkpoint() before reading, which yields control to the event loop
  4. Request Handling: handle_request() processes the HTTP request
  5. Early Return: If validation fails (e.g., incorrect Accept headers in _handle_post_request at line 323 in streamable_http.py), the request returns immediately
  6. Transport Termination: Back in streamable_http_manager.py line 193, http_transport.terminate() is called, closing all streams including write_stream_reader
  7. Race Condition: The message_router task may still be suspended at the checkpoint() yield and has not yet checked the stream state
  8. Error: When the message_router resumes, it proceeds to receive_nowait(), finds the stream closed, and raises ClosedResourceError at memory.py line 93

Code Locations

  • Issue trigger: streamable_http.py:323 - Early return on validation failure
  • Stream creation: streamable_http_manager.py:171 - connect() call
  • Message router: streamable_http.py:831 - async for write_stream_reader
  • Stream termination: streamable_http_manager.py:193 - terminate() call
  • Error source: memory.py:93 - ClosedResourceError in receive_nowait()

Reproduction Steps

  1. Start an MCP server in HTTP Streamable mode
  2. Send a POST request with incorrect Accept headers (missing either application/json or text/event-stream)
  3. The request will fail validation and return quickly
  4. Observe the ClosedResourceError in the server logs

Example Request

curl -X POST http://localhost:8000/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json" \
  -d '{"jsonrpc": "2.0", "method": "initialize", "id": 1}'
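
For scripted reproduction, the same request can be built with the Python standard library. This is a rough equivalent of the curl command above, assuming the same URL; build_invalid_accept_request is a hypothetical helper name, not part of the SDK.

```python
import json
import urllib.request


def build_invalid_accept_request(url: str) -> urllib.request.Request:
    """Build a POST whose Accept header lacks text/event-stream."""
    payload = {"jsonrpc": "2.0", "method": "initialize", "id": 1}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # text/event-stream is deliberately missing, so validation fails early
            "Accept": "application/json",
        },
        method="POST",
    )
```

Passing the result of build_invalid_accept_request("http://localhost:8000/mcp") to urllib.request.urlopen sends it; the server is expected to reject it with an HTTP error, which urlopen raises as urllib.error.HTTPError.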

Error Stack Trace

15:37:00 - mcp.server.streamable_http - ERROR - Error in message router
Traceback (most recent call last):
  File "/data/test_project/.venv/lib/python3.10/site-packages/mcp/server/streamable_http.py", line 831, in message_router
    async for session_message in write_stream_reader:
  File "/data/test_project/.venv/lib/python3.10/site-packages/anyio/abc/_streams.py", line 41, in __anext__
    return await self.receive()
  File "/data/test_project/.venv/lib/python3.10/site-packages/anyio/streams/memory.py", line 111, in receive
    return self.receive_nowait()
  File "/data/test_project/.venv/lib/python3.10/site-packages/anyio/streams/memory.py", line 93, in receive_nowait
    raise ClosedResourceError
anyio.ClosedResourceError

Workaround

Adding a small delay before the early return can mitigate the race condition:

# In streamable_http.py around line 321
import asyncio
await asyncio.sleep(0.1)  # Allow message_router to complete checkpoint

Expected Behavior

The server should handle early request failures gracefully without raising ClosedResourceError exceptions in the message router.

Impact

This issue affects the reliability of MCP servers in HTTP Streamable mode, particularly when clients send malformed requests or when network conditions cause rapid request/response cycles.

Proposed Solution

This issue arises from the interaction of three things: how anyio's checkpoint handling interleaves with the coroutine scheduler in use, the forced checkpoint at the start of receive() (the reason for it is unclear to me; if anyone understands it, please add an explanation, thank you), and the closed-state check inside receive_nowait(), which is what raises the exception. The exception is therefore raised inside anyio, not by the MCP SDK's own code.

Even so, the SDK could address it with either of these approaches:

  1. Approach One: Add exception handling for anyio.ClosedResourceError in the message router loop:

    try:
        async for session_message in write_stream_reader:
            # ... existing code ...
    except anyio.ClosedResourceError:
        # Simply ignore the error, or optionally add a warning log
        # (though the warning log may not be particularly helpful)
        pass
    

    This approach directly handles the race condition by catching and ignoring the expected exception.

  2. Approach Two: Add explicit delays in request validation functions like _handle_post_request:

    async def _handle_post_request(self, scope: Scope, request: Request, receive: Receive, send: Send) -> None:
        # ... existing code ...
    
        if not (has_json and has_sse):
            response = self._create_error_response(
                ("Not Acceptable: Client must accept both application/json and text/event-stream"),
                HTTPStatus.NOT_ACCEPTABLE,
            )
            await response(scope, receive, send)
            await asyncio.sleep(0.1)  # Allow message_router to complete checkpoint
            return
    
        # Validate Content-Type
        if not self._check_content_type(request):
            response = self._create_error_response(
                "Unsupported Media Type: Content-Type must be application/json",
                HTTPStatus.UNSUPPORTED_MEDIA_TYPE,
            )
            await response(scope, receive, send)
            await asyncio.sleep(0.1)  # Allow message_router to complete checkpoint
            return
    

    This approach makes the request handling more complex and harder to understand, but it does not swallow ClosedResourceError, so an unexpected closing of write_stream_reader introduced by a future version would still be noticed rather than silently ignored.

  3. Approach Three: Other suggestions that handle this race condition more cleanly are welcome.

Related Issues

This issue is related to the broader problem described in #1190, where MCP servers in HTTP Streamable mode are broken starting from v1.12.0.

Python & MCP Python SDK

python==3.12.0
mcp==1.14.0

Edison-A-N avatar Sep 13 '25 08:09 Edison-A-N

Thanks @Edison-A-N for submitting this issue as well as a PR proposing a fix: fix: handle ClosedResourceError in StreamableHTTP message router #1384

felixweinberger avatar Oct 03 '25 10:10 felixweinberger

This is affecting us as well and bloating logs significantly.

ofek avatar Oct 21 '25 18:10 ofek

Reproducible example here https://github.com/modelcontextprotocol/python-sdk/issues/1190#issuecomment-3429054580

ofek avatar Oct 21 '25 19:10 ofek

Same here: it affects every request as soon as stateless_http=True is set, but does not seem to impact the server's operation.

Until the fix is merged, you can silence these logs with this in the uvicorn logging.yml:

  mcp.server.streamable_http:
    level: CRITICAL
    handlers: [default]
    propagate: False

Obviously not ideal but better than unreadable bloat
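
A narrower stopgap is possible from Python with a logging filter that drops only these records instead of everything below CRITICAL. This is a sketch under assumptions: it relies on the logger name and the "Error in message router" message text seen in the stack trace above, and DropMessageRouterErrors and install_filter are hypothetical names.

```python
import logging


class DropMessageRouterErrors(logging.Filter):
    """Drop only the 'Error in message router' records and keep everything else."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Returning False discards the record
        return "Error in message router" not in record.getMessage()


def install_filter() -> logging.Filter:
    """Attach the filter to the logger named in the stack trace."""
    log_filter = DropMessageRouterErrors()
    logging.getLogger("mcp.server.streamable_http").addFilter(log_filter)
    return log_filter
```

Calling install_filter() once at startup is enough. Note that it also hides any genuine message router failure logged with the same text, so it should be removed once a fixed release is in use.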

vemonet avatar Nov 13 '25 16:11 vemonet

Closing as a duplicate of #1190, which is tracking this issue. Thank you for the great root cause analysis; I've copied it into a comment on #1190.

maxisbey avatar Dec 02 '25 14:12 maxisbey