No default timeout for requests (unlike TS SDK)
Initial Checks
- [x] I confirm that I'm using the latest version of MCP Python SDK
- [x] I confirm that I searched for my issue in https://github.com/modelcontextprotocol/python-sdk/issues before opening this issue
Description
Spec states: Implementations SHOULD establish timeouts for all sent requests, to prevent hung connections and resource exhaustion.
The TypeScript SDK does have a default timeout of 60 seconds on requests, and the difference between Python & TypeScript is the crux of https://github.com/modelcontextprotocol/typescript-sdk/issues/245 (which also suggests 60 seconds might not be long enough). Relatedly, TypeScript has a `resetTimeoutOnProgress` option (defaulting to false), which would probably be useful if we do introduce a default to the Python SDK.
Example Code
Python & MCP Python SDK
Latest
https://github.com/modelcontextprotocol/python-sdk/pull/1159 adds timeout support
Thanks for opening this — having sensible timeout behavior is critical for production MCP usage, especially when LLMs are orchestrating many tool calls.
From a broader SDK perspective, there are two related gaps:
- Server-side tool timeouts
  - Tool and resource handlers are awaited without any server-side timeout wrapper (e.g. `anyio.fail_after()`).
  - A single hung tool can effectively stall a session unless the client enforces its own timeout and tears down the connection.
  - For typical GenAI workflows, agents often retry or fan out tool calls, so one hung tool can easily multiply into many hung requests.
- Retry policy surfaced to clients
  - On the client side, we currently have `read_timeout_seconds` but no higher-level `RetryPolicy` abstraction that distinguishes between:
    - transient errors (network, transport issues),
    - server-side timeouts,
    - non-retryable errors (validation, permission, etc.).
  - Without a structured view of error types, LLM agents end up guessing when to retry.
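To make the first gap concrete, here is a minimal sketch of a server-side timeout wrapper around a handler. It uses the stdlib `asyncio.wait_for` so it runs on its own; in the SDK's anyio-based code, `anyio.fail_after()` would play the same role. The names `call_with_timeout`, `ToolTimeoutError`, and `DEFAULT_TOOL_TIMEOUT_SECONDS` are hypothetical and not part of the SDK.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Hypothetical default, chosen only for illustration.
DEFAULT_TOOL_TIMEOUT_SECONDS = 30.0


class ToolTimeoutError(Exception):
    """Hypothetical error raised when a handler exceeds its time budget."""


async def call_with_timeout(
    handler: Callable[..., Awaitable[Any]],
    *args: Any,
    timeout: float = DEFAULT_TOOL_TIMEOUT_SECONDS,
) -> Any:
    """Await handler(*args); cancel it if it runs longer than `timeout` seconds."""
    try:
        return await asyncio.wait_for(handler(*args), timeout=timeout)
    except asyncio.TimeoutError as exc:
        # wait_for has already cancelled the handler at this point.
        raise ToolTimeoutError(f"handler exceeded {timeout}s") from exc


async def hung_tool() -> str:
    await asyncio.sleep(3600)  # simulates a tool that never returns
    return "unreachable"


async def fast_tool(x: int) -> int:
    return x * 2
```

With something like this in place, a hung handler surfaces as a distinct error the server can report, instead of blocking the session until the client gives up.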
Concretely, it might be useful to extend this issue (or create a follow-up) to cover:
- A configurable server-side timeout for tool execution (with a reasonable default, e.g. 30s), returning a clear “timeout” error on the protocol layer.
- A client-side `RetryPolicy` or equivalent hook that:
  - treats timeouts and transient transport issues as retryable (with backoff),
  - treats validation/semantic errors as non-retryable by default.
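As a rough illustration of what such a hook could look like, here is a small synchronous sketch. `RetryPolicy`, `run_with_retry`, and `ValidationError` are hypothetical names, not existing SDK APIs, and the defaults are arbitrary. A real version would likely be async and map MCP error codes onto the retryable/non-retryable split.

```python
import random
import time
from dataclasses import dataclass
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


class ValidationError(Exception):
    """Hypothetical non-retryable error (bad arguments, permissions, etc.)."""


@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical policy object; not part of the MCP Python SDK."""

    max_attempts: int = 3
    base_delay: float = 0.5
    max_delay: float = 8.0
    # Only these exception types are retried; everything else propagates.
    retryable: Tuple[Type[BaseException], ...] = (TimeoutError, ConnectionError)

    def delay_for(self, attempt: int) -> float:
        """Exponential backoff with full jitter; `attempt` counts from 1."""
        cap = min(self.max_delay, self.base_delay * 2 ** (attempt - 1))
        return random.uniform(0, cap)


def run_with_retry(
    fn: Callable[[], T],
    policy: RetryPolicy,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying retryable errors up to policy.max_attempts times."""
    for attempt in range(1, policy.max_attempts + 1):
        try:
            return fn()
        except policy.retryable:
            if attempt == policy.max_attempts:
                raise  # out of attempts: surface the last error
            sleep(policy.delay_for(attempt))
    raise RuntimeError("max_attempts must be at least 1")
```

The point of the explicit `retryable` tuple is that the caller (or an agent framework) no longer has to guess: a `ValidationError` fails on the first attempt, while a timeout is retried with backoff.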
If maintainers are open to this direction, I’d be happy to help draft a more detailed proposal and/or work on a PR that wires in:
- a per-tool execution timeout on the server, and
- a minimal `RetryPolicy` on the client with a simple backoff strategy and clear error messages.
This is a valid gap - we should match the TypeScript SDK's behavior here.
What the spec says:
"Implementations SHOULD establish timeouts for all sent requests, to prevent hung connections and resource exhaustion. When the request has not received a success or error response within the timeout period, the sender SHOULD issue a cancellation notification for that request and stop waiting for a response."
This is specifically about client-side request timeouts - how long the client waits for a response from the server.
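A sketch of that behavior, under some assumptions: `send_request` and `send_notification` are hypothetical callables standing in for whatever the session uses to talk to the transport, and the notification shape follows the spec's `notifications/cancelled` message with `requestId` and `reason`. This is not how the SDK is implemented today, only an illustration of "time out, cancel, stop waiting".

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict


async def request_with_timeout(
    send_request: Callable[[], Awaitable[Any]],
    send_notification: Callable[[Dict[str, Any]], Awaitable[None]],
    request_id: int,
    timeout: float,
) -> Any:
    """Wait up to `timeout` seconds for a response, then cancel the request."""
    try:
        return await asyncio.wait_for(send_request(), timeout=timeout)
    except asyncio.TimeoutError:
        # Tell the peer we are no longer interested in this request.
        await send_notification(
            {
                "jsonrpc": "2.0",
                "method": "notifications/cancelled",
                "params": {"requestId": request_id, "reason": "request timed out"},
            }
        )
        raise  # the caller still sees the timeout
```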
TypeScript SDK:
- Default `DEFAULT_REQUEST_TIMEOUT_MSEC = 60000` (60 seconds)
- Per-request `timeout` option
- `resetTimeoutOnProgress` option
Python SDK currently:
- Has `read_timeout_seconds` and `request_read_timeout_seconds` but no default
- No `resetTimeoutOnProgress` equivalent
Proposed fix: Add a default timeout of 60 seconds to match TS SDK, with the ability to override per-request. Could also add progress-based timeout reset as a follow-up.
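For discussion, a minimal sketch of how a 60-second default, a per-request override, and a progress-based reset could fit together. `ProgressTimeout`, `DEFAULT_REQUEST_TIMEOUT_SECONDS`, and `reset_on_progress` are hypothetical names loosely modeled on the TS SDK options, not existing Python SDK APIs, and it uses plain `asyncio` rather than anyio to stay self-contained.

```python
import asyncio
from typing import Any, Awaitable, Optional

# Assumed default, mirroring the TS SDK's 60000 ms.
DEFAULT_REQUEST_TIMEOUT_SECONDS = 60.0


class ProgressTimeout:
    """Hypothetical helper: a deadline that progress notifications can push back."""

    def __init__(self, timeout: Optional[float] = None, reset_on_progress: bool = False):
        # None means "use the default"; any number is a per-request override.
        self.timeout = DEFAULT_REQUEST_TIMEOUT_SECONDS if timeout is None else timeout
        self.reset_on_progress = reset_on_progress
        self._deadline = 0.0

    def on_progress(self) -> None:
        """Call when a progress notification arrives for this request."""
        if self.reset_on_progress:
            self._deadline = asyncio.get_running_loop().time() + self.timeout

    async def wait(self, response: Awaitable[Any]) -> Any:
        """Await `response`, raising TimeoutError once the deadline passes."""
        loop = asyncio.get_running_loop()
        self._deadline = loop.time() + self.timeout
        task = asyncio.ensure_future(response)
        try:
            while True:
                remaining = self._deadline - loop.time()
                if remaining <= 0:
                    raise TimeoutError(f"no response within {self.timeout}s")
                done, _ = await asyncio.wait({task}, timeout=remaining)
                if done:
                    return task.result()
                # Not done: loop again in case on_progress moved the deadline.
        finally:
            if not task.done():
                task.cancel()
```

Keeping `reset_on_progress` off by default would match the TS SDK, so long-running tools only get extra time when the caller opts in.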