
New lifespan for every session?

Open strawgate opened this issue 8 months ago • 12 comments

It seems like maybe the server is invoking a new lifespan for every sse connection?

I think it should only be invoking the lifespan once but providing the yielded objects in the context to each request?

strawgate avatar Apr 15 '25 05:04 strawgate

Can you please provide an MRE? I am not sure if that is behavior caused by FastMCP or MCP (or both, if a FastMCP client is dynamically invoking a low level server). FastMCP's only interaction with the lifespan object is to pass it directly to the low-level server.

jlowin avatar Apr 15 '25 14:04 jlowin

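from contextlib import asynccontextmanager

from fastmcp import FastMCP


@asynccontextmanager
async def lifespan(app):
    # Set a breakpoint here to see when the lifespan is entered.
    yield "lifespan"

mcp = FastMCP(name="Parent MCP Server", lifespan=lifespan)

if __name__ == "__main__":
    mcp.run(transport="sse")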

Run via sse, set a breakpoint in the lifespan context manager and then open MCP Inspector and connect. The breakpoint will not hit after running the server.

The breakpoint will hit when you connect, and every time you press reconnect.

i.e. it looks like the lifespan is only used when a user connects and is not actually for server startup / shutdown?

strawgate avatar Apr 15 '25 15:04 strawgate

I may have lied about it not hitting after starting the server, I am seeing it hit when starting the server and for each connection.


    def sse_app(self) -> Starlette:
        """Return an instance of the SSE server app."""
        sse = SseServerTransport(self.settings.message_path)

        async def handle_sse(request: Request) -> None:
            async with sse.connect_sse(
                request.scope,
                request.receive,
                request._send,  # type: ignore[reportPrivateUsage]
            ) as streams:
                await self._mcp_server.run(
                    streams[0],
                    streams[1],
                    self._mcp_server.create_initialization_options(),
                )

        return Starlette(
            debug=self.settings.debug,
            routes=[
                Route(self.settings.sse_path, endpoint=handle_sse),
                Mount(self.settings.message_path, app=sse.handle_post_message),
            ],
        )

It looks like mcp_server.run() is calling run for every sse session and every call to run invokes the lifespan context manager again.

It looks like this is the intended behavior. I would have expected it to be called a server session, or something similar, rather than a server. But it seems that if we want to share things across sessions, like database connections, the lifespan is not the way to do that, and the lifespan should only be used for per-session dependencies.
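
The behavior described above can be reproduced without any MCP code. The following is a toy model using only the standard library, not the SDK's implementation: `ToyServer` and its `run()` method are hypothetical stand-ins for a server whose `run()` is called once per connection and enters the lifespan on each call.

```python
import asyncio
from contextlib import asynccontextmanager

lifespan_entries = 0  # counts how many times the lifespan is entered


@asynccontextmanager
async def lifespan(app):
    global lifespan_entries
    lifespan_entries += 1
    yield "lifespan"


class ToyServer:
    """Hypothetical stand-in for a server whose run() wraps one session."""

    def __init__(self, lifespan):
        self.lifespan = lifespan

    async def run(self, session_id):
        # Entering the lifespan inside run() ties it to the session,
        # not to process startup.
        async with self.lifespan(self) as ctx:
            return f"session {session_id} saw {ctx!r}"


async def main():
    server = ToyServer(lifespan)
    # Three "connections" mean three run() calls and three lifespan entries.
    return [await server.run(i) for i in range(3)]


results = asyncio.run(main())
print(lifespan_entries)
```

In this model the counter grows with the number of sessions, which matches the breakpoint hitting on every reconnect.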

strawgate avatar Apr 15 '25 23:04 strawgate

The same thing happened to me. When I stop and restart the client service, the lifespan context manager of the MCP server is called multiple times. It's called on the first session connection, not on server startup. Is this intended? For example, the DB instance initialization code runs multiple times after server startup, depending on the client state. Where should I place the instance initialization code? Thanks.

bldev2473 avatar Apr 21 '25 09:04 bldev2473

I think this is just how MCP is designed to work, it's supposed to create a new server for every session.

I think if you want to initialize a singleton db instance, you'd do it before you call run on the MCP server instead of in the context. Depending on how you structure your server, you could use the new MCPMixin to make a class that you initialize first and then register its tools, or you could put a global db var at the root of your server.py and write an asynccontextmanager for the database setup that stores the db instance in the global var for your tools to use.
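
A minimal sketch of the global-variable variant, assuming only the standard library. `FakeDatabase`, `database()` and `lookup_tool()` are hypothetical names used for illustration; a real server would start its run loop where the comment indicates.

```python
import asyncio
from contextlib import asynccontextmanager


class FakeDatabase:
    """Hypothetical placeholder for a real connection pool."""

    def __init__(self):
        self.closed = False

    async def close(self):
        self.closed = True


db = None  # module-level handle that tool functions read


@asynccontextmanager
async def database():
    """Create the shared database once and publish it in the global."""
    global db
    db = FakeDatabase()
    try:
        yield db
    finally:
        await db.close()
        db = None


async def lookup_tool():
    # A tool reads the global instead of the per-session lifespan context.
    if db is None:
        raise RuntimeError("database() has not been entered")
    return "ok"


async def main():
    async with database() as shared:
        # A real server would be started here, e.g. by awaiting its
        # async run method, while the database stays open.
        first = await lookup_tool()
        second = await lookup_tool()
    return shared, first, second


shared, first, second = asyncio.run(main())
```

The trade-off is the usual one for globals: the resource is created once, but tools lose the typed lifespan context.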

strawgate avatar Apr 21 '25 15:04 strawgate

Looking at this again, this is not actually implemented by the MCP SDK; instead, it is part of FastMCP starting a new server for every SSE connection.

So perhaps this is back to @jlowin -- is it possible to not create a unique server for every SSE connection and then the lifespan goes back to being server start/shutdown instead of per-SSE session?

strawgate avatar Apr 21 '25 15:04 strawgate

As far as I'm aware, this is just how the low-level protocol was designed (and FastMCP re-exposes), see e.g. https://github.com/modelcontextprotocol/python-sdk/blob/b4c7db6a50a5c88bae1db5c1f7fba44d16eebc6e/examples/servers/simple-tool/mcp_simple_tool/server.py#L61. I'm not sure if the server can disambiguate between streams from multiple clients at once? Maybe the new streamable HTTP solves this?

I'm more than happy to add an alternative approach in FastMCP if it's more performant; I'm just not sure if the low-level supports it.

jlowin avatar Apr 23 '25 19:04 jlowin

Here's a thought: we could expose the current lifespan param as a low-level one that is invoked with each session and also expose a new lifespan that is only invoked when the FastMCP instance itself is run. So for your use case you'd use the new param.

jlowin avatar Apr 24 '25 03:04 jlowin

@strawgate any thoughts on what a good naming convention would be for clarity? e.g. lifespan / fastmcp_lifespan or maybe session_lifespan / server_lifespan (where the current lifespan becomes session lifespan and the new one would be server lifespan). There's a piece of me that's slightly nervous to expose what amounts to an implementation detail of the Python SDK as a public kwarg, maybe it's worth seeing what the final stateless HTTP implementation looks like, though the current PRs also call mcp.run() for each new session.

jlowin avatar Apr 24 '25 18:04 jlowin

I was thinking session_lifespan and server_lifespan but given that each sse is its own server it's a bit confusing to call them that.

I was trying to think if this was something we could solve with a contrib module -- with a FastMCP wrapper that takes its own lifespan and then we don't muddy the base implementation.

strawgate avatar Apr 25 '25 02:04 strawgate

It might be an appropriate use of contrib in the sense that it doesn't "amberize" the implementation but still permits it to be available (in case the upstream SDK changes this in the near future for HTTP).

jlowin avatar Apr 25 '25 15:04 jlowin

For the db connection that's initialized in the shared context to be useful it has to be available in the normal context and I don't think there's a straightforward way to do that in a wrapper.

The Context is a generic which keeps the type returned from the lifespan via LifespanContextT.

I think for now we should just offer an example of doing this yourself (by initializing shared resources via context managers and then using those to build the lifespans) and maybe rename/document the lifespan as being a session lifespan
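
One way to read "using those to build the lifespans" is a small factory that closes over an already-open resource. This is a sketch under that assumption, using only the standard library; `make_session_lifespan` and `open_pool` are hypothetical names, and a dict stands in for a real pool.

```python
import asyncio
from contextlib import asynccontextmanager


def make_session_lifespan(shared):
    """Build a per-session lifespan that hands out an existing resource."""

    @asynccontextmanager
    async def session_lifespan(app):
        # Nothing is created or closed here; the session just borrows
        # the resource owned by the outer context manager.
        yield shared

    return session_lifespan


@asynccontextmanager
async def open_pool():
    """Hypothetical shared resource; a dict stands in for a pool."""
    pool = {"opened": 1, "closed": 0}
    try:
        yield pool
    finally:
        pool["closed"] += 1


async def main():
    seen = []
    async with open_pool() as pool:
        lifespan = make_session_lifespan(pool)
        # Simulate two sessions entering the per-session lifespan.
        for _ in range(2):
            async with lifespan(None) as ctx:
                seen.append(ctx)
        closed_during_sessions = pool["closed"]
    return pool, seen, closed_during_sessions


pool, seen, closed_during_sessions = asyncio.run(main())
```

In a real server the returned function would be passed as the `lifespan` argument, so each session's context is the shared object and its type is still visible to tools.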

strawgate avatar Apr 27 '25 16:04 strawgate

@jlowin given the lack of clarity on lifespans, what is the current recommended way to manage global long-lived resources with non-trivial startup times (e.g. database connection pools)?

In my scenario, I have a few independent MCPs that are all mounted to an "index_mcp", which I run in stateless mode and expose by mounting to an existing FastAPI server. Some of these underlying MCPs need to instantiate connection pools, caches, etc. It's not feasible to have these inits run on every request, as is the default behavior of lifespans in stateless mode.

iveshenry18 avatar Jun 05 '25 00:06 iveshenry18

Let's make a contrib module which has a subclass of FastMCP, which takes an async context manager and enters it when the user calls any of the run methods. I think we can throw a not implemented error if the user tries mounting or importing.

Perhaps we can make it so that whatever the async context manager returns gets put in a dummy function and provided to the underlying server as the LifespanResultT

strawgate avatar Oct 04 '25 02:10 strawgate

Claude finished @strawgate's task


Investigation Complete: Lifespan Behavior in FastMCP

Recommendation

Create a contrib module ServerLifespanMixin that provides true server-level lifespan management for shared resources like database connection pools. The current lifespan parameter works as designed by the upstream MCP SDK (per-session), but this doesn't meet the common use case of managing long-lived shared resources.

The contrib module should:

  1. Accept an async context manager for server-wide initialization
  2. Enter it once when any run() method is called
  3. Make the context available to per-session lifespans
  4. Throw NotImplementedError for mounting/importing operations
  5. Be clearly documented as experimental/contrib

Key Findings

Findings: Current Behavior is Upstream SDK Design

How Lifespan Currently Works

The behavior described in this issue is intentional design from the upstream MCP SDK, not a FastMCP bug:

  1. Per-Session Invocation: In src/fastmcp/server/http.py:158, for every SSE connection:

    async def handle_sse(scope: Scope, receive: Receive, send: Send) -> Response:
        async with sse.connect_sse(scope, receive, send) as streams:
            await server._mcp_server.run(  # New server.run() for each connection
                streams[0], streams[1],
                server._mcp_server.create_initialization_options(),
            )
    
  2. Upstream SDK Pattern: The MCP SDK's Server.run() method enters the lifespan context for each call. This is documented in their examples (e.g., examples/snippets/servers/lowlevel/lifespan.py).

  3. Streamable HTTP Same Behavior: The StreamableHTTPSessionManager also creates a new server instance per session.

Why This Causes Problems

Multiple users report issues:

  • #1808 (@LeoWanty): "I'm looking for... a way to use clean lifespan Context (avoid global variables or special tool to open / close connection)"
  • #166 (@iveshenry18): "I have a few independent MCPs... Some of these underlying MCPs need to instantiate connection pools, caches, etc. It's not feasible to have these inits run on every request"
  • #775, #1115: Similar complaints about lifespan being invoked per-session rather than once at startup

Current Workarounds

Users currently must initialize shared resources outside the MCP lifecycle:

  1. Global Variables (from @strawgate's examples):

    searxng_client = None
    
    @asynccontextmanager
    async def lifespan(mcp: FastMCP):
        global searxng_client
        searxng_client = httpx.AsyncClient(...)
        yield
        await searxng_client.aclose()
    
  2. Class-Based Servers (using MCPMixin contrib):

    class MyServer(MCPMixin):
        def __init__(self):
            self.db = Database()  # Initialize before MCP
            super().__init__()
    
  3. Async Main Pattern:

    async def main():
        db = await Database.connect()
        mcp = FastMCP("server")
        # Pass db via closure or global
        await mcp.run_async()
    
Findings: Why the Current lifespan Parameter Exists

FastMCP's lifespan parameter serves a specific purpose:

  1. Type-Safe Context Injection: It allows tools to receive typed context via Context[ServerSession, LifespanResultT]
  2. Per-Session Resources: Useful for resources that should be session-scoped (e.g., user-specific auth tokens, session state)
  3. SDK Compatibility: FastMCP directly wraps the low-level SDK's lifespan mechanism

From src/fastmcp/server/server.py:106-122:

def _lifespan_wrapper(
    app: FastMCP[LifespanResultT],
    lifespan: Callable[...],
) -> Callable[...]:
    @asynccontextmanager
    async def wrap(s: LowLevelServer[LifespanResultT]) -> AsyncIterator[LifespanResultT]:
        async with AsyncExitStack() as stack:
            context = await stack.enter_async_context(lifespan(app))
            yield context
    return wrap

This wrapper is passed directly to the low-level MCP SDK server, which invokes it per-session.

Findings: Previous Discussion and Maintainer Position

From the issue timeline:

  1. @jlowin (2025-04-23): "As far as I'm aware, this is just how the low-level protocol was designed... I'm more than happy to add an alternative approach in FastMCP if it's more performant I'm just not sure if the low-level supports it."

  2. Naming Discussion (2025-04-24): Considered lifespan / fastmcp_lifespan or session_lifespan / server_lifespan, but concern about "exposing what amounts to an implementation detail of the Python SDK as a public kwarg"

  3. Contrib Module Suggestion (2025-04-25):

    • @strawgate: "I was trying to think if this was something we could solve with a contrib module"
    • @jlowin: "It might be an appropriate use of contrib in the sense that it doesn't 'amberize' the implementation"
  4. Latest Request (2025-10-04 @strawgate): "Let's make a contrib module which has a subclass of FastMCP, which takes an async context manager and enters it when the user calls any of the run methods."

Detailed Action Plan

Step-by-Step Implementation Guide

1. Create the Contrib Module File

Create src/fastmcp/contrib/server_lifespan.py:

"""
Server-level lifespan management for FastMCP.

This module provides a FastMCP subclass that manages server-wide resources
(like database connection pools) that should persist across multiple client sessions.

Note: This is experimental. The upstream MCP SDK's lifespan is designed to be
per-session. This contrib module works around that limitation for specific use cases.
"""

from collections.abc import AsyncIterator
from contextlib import AbstractAsyncContextManager, AsyncExitStack, asynccontextmanager
from typing import Callable, Generic, TypeVar

from fastmcp import FastMCP

ServerContextT = TypeVar("ServerContextT")


class ServerLifespanMixin(FastMCP[ServerContextT], Generic[ServerContextT]):
    """
    FastMCP subclass with server-wide lifespan management.
    
    Use this when you need to initialize expensive resources (like database
    connection pools) once when the server starts, rather than for each session.
    
    Example:
        ```python
        @asynccontextmanager
        async def server_lifespan():
            db_pool = await create_db_pool()
            try:
                yield db_pool
            finally:
                await db_pool.close()
        
        mcp = ServerLifespanMixin("My Server", server_lifespan=server_lifespan)
        
        @mcp.tool()
        def query(ctx: Context[ServerSession, DBPool]) -> str:
            # ctx.request_context.lifespan_context is the db_pool
            return ctx.request_context.lifespan_context.query()
        
        asyncio.run(mcp.run_async())  # the sync run() is not supported
        ```
    
    Limitations:
        - Cannot use mount() or import_server() - raises NotImplementedError
        - Server lifespan is entered once per run_async() call, not per session
        - The sync run() is not supported; use run_async()
    """
    
    def __init__(
        self,
        name: str | None = None,
        *,
        server_lifespan: Callable[[], AbstractAsyncContextManager[ServerContextT]],
        **kwargs
    ):
        """
        Initialize with server-wide lifespan.
        
        Args:
            name: Server name
            server_lifespan: Async context manager for server-wide resources
            **kwargs: Other FastMCP arguments (except 'lifespan')
        """
        if "lifespan" in kwargs:
            raise ValueError(
                "Cannot specify both 'lifespan' and 'server_lifespan'. "
                "Use 'server_lifespan' for server-wide resources."
            )
        
        self._server_lifespan_cm = server_lifespan
        self._server_context: ServerContextT | None = None
        self._server_lifespan_stack: AsyncExitStack | None = None
        
        # Create a per-session lifespan that returns the server context
        @asynccontextmanager
        async def session_lifespan(app: FastMCP) -> AsyncIterator[ServerContextT]:
            # Return the already-initialized server context
            if self._server_context is None:
                raise RuntimeError(
                    "Server lifespan not initialized. This should not happen."
                )
            yield self._server_context
        
        super().__init__(name=name, lifespan=session_lifespan, **kwargs)
    
    def mount(self, *args, **kwargs):
        """Mounting is not supported with ServerLifespanMixin."""
        raise NotImplementedError(
            "mount() is not supported with ServerLifespanMixin. "
            "Server-wide lifespan management is incompatible with mounting."
        )
    
    def import_server(self, *args, **kwargs):
        """Importing is not supported with ServerLifespanMixin."""
        raise NotImplementedError(
            "import_server() is not supported with ServerLifespanMixin. "
            "Server-wide lifespan management is incompatible with importing."
        )
    
    async def _enter_server_lifespan(self):
        """Enter the server-wide lifespan context."""
        if self._server_lifespan_stack is not None:
            # Already entered
            return
        
        self._server_lifespan_stack = AsyncExitStack()
        self._server_context = await self._server_lifespan_stack.enter_async_context(
            self._server_lifespan_cm()
        )
    
    async def _exit_server_lifespan(self):
        """Exit the server-wide lifespan context."""
        if self._server_lifespan_stack is not None:
            await self._server_lifespan_stack.aclose()
            self._server_lifespan_stack = None
            self._server_context = None
    
    async def run_async(self, *args, **kwargs):
        """Run the server with server-wide lifespan management."""
        await self._enter_server_lifespan()
        try:
            await super().run_async(*args, **kwargs)
        finally:
            await self._exit_server_lifespan()
    
    def run(self, *args, **kwargs):
        """Synchronous run() is not supported; use run_async() instead."""
        # run() is synchronous, but the server-wide lifespan has to be entered
        # and exited on the same event loop that serves requests, so only
        # run_async() is supported for now.
        raise NotImplementedError(
            "Use run_async() instead of run() with ServerLifespanMixin, "
            "or manually manage the lifespan with _enter_server_lifespan() and _exit_server_lifespan()"
        )
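
The enter-once and exit-once bookkeeping in the listing above can be exercised on its own, without FastMCP. The following is a standard-library sketch of that guard; `OnceLifespan` and `resource` are hypothetical names and are not part of the proposed module.

```python
import asyncio
from contextlib import AsyncExitStack, asynccontextmanager


class OnceLifespan:
    """Enter an async context manager at most once; hypothetical helper."""

    def __init__(self, factory):
        self._factory = factory
        self._stack = None
        self.context = None

    async def enter(self):
        if self._stack is not None:
            return self.context  # already entered, reuse the context
        stack = AsyncExitStack()
        self.context = await stack.enter_async_context(self._factory())
        self._stack = stack  # only recorded once entering has succeeded
        return self.context

    async def exit(self):
        if self._stack is not None:
            await self._stack.aclose()
            self._stack = None
            self.context = None


events = []


@asynccontextmanager
async def resource():
    events.append("enter")
    try:
        yield object()
    finally:
        events.append("exit")


async def main():
    once = OnceLifespan(resource)
    a = await once.enter()
    b = await once.enter()  # second call is a no-op
    await once.exit()
    await once.exit()  # safe to call again
    return a is b


same = asyncio.run(main())
```

One deliberate difference from the listing: the stack is stored only after the context manager has been entered, so a failing startup does not leave the guard in an "already entered" state.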

2. Update src/fastmcp/contrib/__init__.py

Add the new export:

from fastmcp.contrib.server_lifespan import ServerLifespanMixin

__all__ = [...existing..., "ServerLifespanMixin"]

3. Create Tests

Create tests/contrib/test_server_lifespan.py:

import pytest
from contextlib import asynccontextmanager

from fastmcp.contrib.server_lifespan import ServerLifespanMixin
from fastmcp import Context


class TestServerLifespanMixin:
    async def test_server_lifespan_basic(self):
        """Test that server lifespan is entered once and shared across sessions."""
        enter_count = 0
        exit_count = 0
        
        @asynccontextmanager
        async def server_lifespan():
            nonlocal enter_count, exit_count
            enter_count += 1
            yield {"db": "connection"}
            exit_count += 1
        
        mcp = ServerLifespanMixin("Test", server_lifespan=server_lifespan)
        
        @mcp.tool()
        def get_db(ctx: Context) -> str:
            return str(ctx.request_context.lifespan_context)
        
        # TODO: Test that enter_count is 1 after starting server
        # TODO: Test that multiple tool calls use the same context
        # TODO: Test that exit_count is 1 after stopping server
    
    async def test_mount_raises_error(self):
        """Test that mounting raises NotImplementedError."""
        @asynccontextmanager
        async def server_lifespan():
            yield {}
        
        mcp = ServerLifespanMixin("Test", server_lifespan=server_lifespan)
        other = ServerLifespanMixin("Other", server_lifespan=server_lifespan)
        
        with pytest.raises(NotImplementedError, match="mount.*not supported"):
            mcp.mount(other)
    
    async def test_import_server_raises_error(self):
        """Test that importing raises NotImplementedError."""
        @asynccontextmanager
        async def server_lifespan():
            yield {}
        
        mcp = ServerLifespanMixin("Test", server_lifespan=server_lifespan)
        other = ServerLifespanMixin("Other", server_lifespan=server_lifespan)
        
        with pytest.raises(NotImplementedError, match="import_server.*not supported"):
            mcp.import_server(other)

4. Create Documentation

Create docs/contrib/server-lifespan.mdx:

---
title: Server Lifespan
description: Managing server-wide resources across sessions
---

## Overview

The `ServerLifespanMixin` provides true server-level lifespan management for resources that should persist across multiple client sessions, such as database connection pools, caches, or API clients.

<Warning>
This is an experimental contrib module. The upstream MCP SDK's lifespan is designed to be per-session. This module works around that limitation for specific use cases where you need server-wide resource management.
</Warning>

## Problem

By default, FastMCP's `lifespan` parameter is invoked **for every client session** (not once at server startup). This is the intended behavior of the upstream MCP SDK, but causes problems when:

- Initializing expensive resources like database connection pools
- Managing shared caches across sessions
- Setting up resources with non-trivial startup time

## Solution

Use `ServerLifespanMixin` instead of `FastMCP`:

```python
from contextlib import asynccontextmanager
from fastmcp.contrib import ServerLifespanMixin
from fastmcp import Context

@asynccontextmanager
async def server_lifespan():
    """Initialize server-wide resources."""
    db_pool = await create_db_connection_pool()
    try:
        yield db_pool
    finally:
        await db_pool.close()

mcp = ServerLifespanMixin("My Server", server_lifespan=server_lifespan)

@mcp.tool()
def query_database(query: str, ctx: Context) -> str:
    """Query the database using the shared connection pool."""
    db_pool = ctx.request_context.lifespan_context
    return db_pool.execute(query)

if __name__ == "__main__":
    import asyncio

    # Server lifespan is entered once here
    asyncio.run(mcp.run_async())
```

## Limitations

- Cannot mount or import: `mount()` and `import_server()` raise `NotImplementedError`
- Use `run_async()`: The sync `run()` method is not supported
- Experimental: This pattern may change if the upstream SDK modifies its lifespan behavior

## When NOT to Use This

Use the standard FastMCP `lifespan` if you need:

- Per-session resources (e.g., user-specific auth tokens)
- Resources that should be isolated between sessions
- Server composition via mounting

## Related Issues


5. Update Main Documentation

Add a note to docs/servers/server.mdx under the lifespan parameter:

```mdx
<ParamField body="lifespan" type="AsyncContextManager | None">
  An async context manager function for **per-session** startup and shutdown logic.
  
  <Warning>
  This lifespan is invoked **for every client session**, not once at server startup.
  For server-wide resource management (like database connection pools), see the
  [ServerLifespanMixin](/contrib/server-lifespan) contrib module.
  </Warning>
</ParamField>
```

Related Items

Related Issues and Pull Requests

| Repository | Item | Relevance |
|---|---|---|
| jlowin/fastmcp | #166: New lifespan for every session? | This issue - documents the problem |
| jlowin/fastmcp | #1808: Managing connections through lifespan | User asking for exactly this feature |
| jlowin/fastmcp | #1835: Support an application-level state object | Related request for server-wide state |
| jlowin/fastmcp | #775: fastmcp uses lifespan, but lifespan does not take effect | Closed - user confusion about per-session behavior |
| jlowin/fastmcp | #1115: Lifespan's startup and shutdown sections are executed for every tool call | Closed - another report of the same behavior |
| jlowin/fastmcp | #311: Proxied servers have even weird-er lifespans | Open - related lifespan complexity with proxies |
| jlowin/fastmcp | #129: Enter mounted app lifespans | Merged - fixed sub-MCP lifespan management |
| jlowin/fastmcp | #176: Refactor MCP composition | Merged - major refactor that addressed some lifespan issues |

Related Files in FastMCP

| File | Relevance | Key Sections |
|---|---|---|
| src/fastmcp/server/server.py | Core server implementation with lifespan wrapper | 106-122 (lifespan wrapper), 125-189 (FastMCP.__init__) |
| src/fastmcp/server/http.py | SSE/HTTP transport that calls run() per session | 156-163 (handle_sse), 240-249 (create_streamable_http_app) |
| src/fastmcp/contrib/mcp_mixin.py | Example of class-based pattern for resource management | Entire file - shows alternative pattern |
| docs/servers/server.mdx | Documentation of lifespan parameter | 47-49 |

Related Files in Upstream MCP SDK

| File | Relevance |
|---|---|
| src/mcp/server/lowlevel/server.py | Server.run() method that enters lifespan |
| examples/snippets/servers/lowlevel/lifespan.py | Official example showing per-session lifespan usage |
| src/mcp/server/streamable_http_manager.py | StreamableHTTPSessionManager.run() manages session lifecycle |

Note: This is a triage report only. No code changes have been made per the instructions. The recommendation is to implement a contrib module as requested by @strawgate in the latest comment.

This actually came up in the MCP committee meeting I attended last month -- I believe the current lifespan behavior is viewed as a bug. It was intended to have server-lifespan semantics but instead has connection-lifespan semantics. However it sounds like it won't be changed immediately due to backwards compatibility concerns.

jlowin avatar Oct 04 '25 23:10 jlowin