[BUG] MCP session 404 in multi worker production environment

Open sqrt676 opened this issue 6 months ago • 6 comments

Issue: 404 session not found error due to session not shared across instances behind load balancer

Hi,

I've implemented an MCP server and during local testing with the mcp_use client in SSE configuration, everything works as expected.

However, after deploying the code to production, I started encountering a 404 session not found error.

Upon inspecting the logs, I realized that the root cause is that the session is being created on one instance, but since the application is running behind a load balancer, requests are distributed across different instances/pods. This causes the session lookup to fail when subsequent requests hit a different instance than the one where the session was initially created.
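The failure described above can be reproduced in a small simulation. This is only a sketch under stated assumptions: `Worker`, `open_session`, and `post_message` are hypothetical stand-ins for a server instance that keeps sessions in process memory, and the round-robin cycle stands in for the load balancer. None of it is fastapi_mcp or MCP SDK code.

```python
import itertools
import uuid


class Worker:
    """Hypothetical stand-in for one instance/pod with its own in-memory sessions."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}  # session_id -> metadata, visible to this worker only

    def open_session(self):
        # Mimics the initial SSE connection: the session lives on this worker.
        session_id = uuid.uuid4().hex
        self.sessions[session_id] = {"created_on": self.name}
        return session_id

    def post_message(self, session_id):
        # Mimics a follow-up request: 404 if this worker never saw the session.
        return 200 if session_id in self.sessions else 404


workers = [Worker("pod-a"), Worker("pod-b")]
round_robin = itertools.cycle(workers)  # naive load balancer

session_id = next(round_robin).open_session()  # lands on pod-a
statuses = [next(round_robin).post_message(session_id) for _ in range(4)]
print(statuses)  # [404, 200, 404, 200]
```

Every request that reaches `pod-b` fails, because only `pod-a` holds the session. With a single worker, as in local testing, the lookup always succeeds.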

Is there a recommended way to persist or share session state across instances, or any guidance for running an MCP server behind a load balancer in a multi-instance environment?

Thanks!

sqrt676 avatar Jul 02 '25 11:07 sqrt676

It seems that HTTP+SSE has been deprecated. For more information, you can refer to the Transports - Model Context Protocol specification.

You might be interested in an issue discussing the implementation of Streamable HTTP. The implementation is still in progress and currently only supports processing POST requests with the stateless Streamable HTTP transport. Given this, it may not be ready for direct use in your project, although I am currently using it in testing in my own project. 😊
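To illustrate why a stateless POST-only transport avoids the problem, here is a minimal sketch. It assumes a simplified JSON-RPC shape loosely modelled on an MCP `tools/call` request; `TOOLS` and `handle_post` are hypothetical names, and the result format is not the real MCP result schema.

```python
import json

# Hypothetical tool registry; a real server would get this from the MCP SDK.
TOOLS = {"add": lambda a, b: a + b}


def handle_post(raw_body):
    """Answer one JSON-RPC request from the body alone, with no session lookup."""
    request = json.loads(raw_body)
    params = request.get("params", {})
    tool = TOOLS.get(params.get("name"))
    if request.get("method") != "tools/call" or tool is None:
        # Simplified: unknown methods and unknown tools share one error.
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": request.get("id"), "error": error})
    result = tool(**params.get("arguments", {}))
    return json.dumps({"jsonrpc": "2.0", "id": request.get("id"), "result": result})


body = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "add", "arguments": {"a": 2, "b": 3}},
})

# Two independent calls play the role of two workers; neither keeps client state.
replies = [handle_post(body), handle_post(body)]
print(replies[0])
```

Because each request carries everything needed to answer it, it does not matter which instance the load balancer picks.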

Another viable approach is to start the transport directly using mcp.server.fastmcp. More details can be found in the GitHub - modelcontextprotocol/python-sdk repository; the README is a good place to get started.

Edison-A-N avatar Jul 02 '25 14:07 Edison-A-N

Another viable approach is to start the transport directly using mcp.server.fastmcp. More details can be found in the GitHub - modelcontextprotocol/python-sdk repository; the README is a good place to get started.

Can you elaborate a little bit more on this?

I'm having the same issue as OP (running on AWS ECS behind an ALB). Locally it works fantastic, but behind the load balancer the session is lost and it doesn't work.

dhuesca avatar Jul 03 '25 09:07 dhuesca

Another viable approach is to start the transport directly using mcp.server.fastmcp. More details can be found in the GitHub - modelcontextprotocol/python-sdk repository; the README is a good place to get started.

Can you elaborate a little bit more on this?

I'm having the same issue as OP (running on AWS ECS behind an ALB). Locally it works fantastic, but behind the load balancer the session is lost and it doesn't work.

I apologize for any confusion in my previous response.

To clarify, if you want to use the streamableHttp transport with FastAPI, I recommend using the Python SDK directly, as my implementation (tadata-org/fastapi_mcp#188) is not yet production-ready.

For now, I suggest using "convert_openapi_schema_to_mcp_tool" from FastApiMCP.setup_server and running the server on the streamableHttp transport. Note that the streamableHttp transport is not currently supported in fastapi_mcp; this is tracked in tadata-org/fastapi_mcp#61.

Edison-A-N avatar Jul 05 '25 07:07 Edison-A-N

Yes, it runs when tested locally but fails in the production environment.

SeqCrafter avatar Jul 10 '25 03:07 SeqCrafter

convert_openapi_schema_to_mcp_tool

What is this?

Lqm1 avatar Jul 15 '25 06:07 Lqm1

Yes, it runs when tested locally but fails in the production environment.

Yep! It's all because of the multi-worker environment: the load balancer may route requests to different instances of the server, and those instances are, in my opinion, stateful in nature. Hence the issue.
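One workaround for stateful instances is session affinity, where the balancer always sends a session back to the worker that created it. The sketch below is a toy model under stated assumptions: `Worker` and `StickyBalancer` are hypothetical classes, and a real load balancer would implement stickiness differently (for example with cookies). Nothing in this thread confirms that this resolves the issue for fastapi_mcp.

```python
class Worker:
    """Hypothetical stand-in for one stateful server instance."""

    def __init__(self, name):
        self.name = name
        self.sessions = set()

    def open_session(self, session_id):
        self.sessions.add(session_id)

    def post_message(self, session_id):
        return 200 if session_id in self.sessions else 404


class StickyBalancer:
    """Toy balancer that pins each session to the worker that created it."""

    def __init__(self, workers):
        self.workers = workers
        self.affinity = {}  # session_id -> Worker that owns the session
        self.next_index = 0

    def open_session(self, session_id):
        # New sessions are still spread round-robin across workers.
        worker = self.workers[self.next_index % len(self.workers)]
        self.next_index += 1
        worker.open_session(session_id)
        self.affinity[session_id] = worker
        return worker.name

    def post_message(self, session_id):
        # Follow-up requests always go to the owning worker.
        worker = self.affinity.get(session_id)
        return worker.post_message(session_id) if worker else 404


balancer = StickyBalancer([Worker("pod-a"), Worker("pod-b")])
owners = [balancer.open_session("s1"), balancer.open_session("s2")]
statuses = [balancer.post_message(s) for s in ("s1", "s2", "s1", "s2", "unknown")]
print(owners, statuses)  # ['pod-a', 'pod-b'] [200, 200, 200, 200, 404]
```

The trade-off is that a session is still lost if its owning worker restarts, which is why a stateless transport is the more robust direction.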

sqrt676 avatar Jul 15 '25 13:07 sqrt676