Support stateless for multi container scaling and deployment

Open sherodtaylor opened this issue 1 year ago • 10 comments

Description

I have a custom Kubernetes deployment that requires stateless applications, which have been standard practice for a long time.

Suggested solution

I'd like to be able to deploy the app with any necessary state offloaded to Redis, or to see marimo refactored to support stateless applications, which have been standard practice for a long time (i.e., twelve-factor apps).

Alternative

Our internal Kubernetes system doesn't support sticky sessions.

Additional context

error received:

  File "/bb/libexec/workflow-metrics-notebooks/python/lib/python3.10/site-packages/marimo/_server/api/deps.py", line 135, in require_current_session
    raise ValueError(f"Invalid session id: {session_id}")
ValueError: Invalid session id: s_p41hf3

import argparse
import logging
from typing import Any

import marimo
from fastapi import FastAPI
import uvicorn



LOG_LEVELS = ["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]

def parse_args(args: list[str] | None = None) -> dict[str, Any]:
    parser = argparse.ArgumentParser()
    parser.add_argument("--log-file", "-l")
    parser.add_argument("--host", default="0.0.0.0")
    parser.add_argument("--log-level", default="INFO", choices=LOG_LEVELS)
    parser.add_argument("--dir", default="./notebooks")
    parsed_args, unknown = parser.parse_known_args(args=args)
    logging.info(f'unknown args - {unknown}')
    return vars(parsed_args)



def start_server(host: str, dir: str) -> None:

    server = (
        marimo.create_asgi_app()
        .with_app(path="", root=f'{dir}/workflow-metrics.py')
    )

    # Create a FastAPI app (the host is passed to uvicorn below, not to FastAPI)
    app = FastAPI()
    app.mount("/", server.build())

    uvicorn.run(app, host=host, port=8080)


def main() -> None:
    args = parse_args()
    start_server(args.pop("host"), args.pop("dir"))

if __name__ == "__main__":
    main()

sherodtaylor avatar Jul 19 '24 18:07 sherodtaylor

Copying a response from Discord, in case others are also interested:

This will be difficult since not everything is serializable, so marimo can't be fully stateless. These are running programs that inherently have state. Even if we made marimo stateless, your code may not be (e.g., threads, DB connections, etc.).
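
To make the serialization point concrete, here is a minimal sketch using only the standard library. It is an illustration rather than marimo code, and `can_pickle` is a hypothetical helper: plain data round-trips through `pickle`, while an object wrapping a live resource such as a lock does not.

```python
import pickle
import threading


def can_pickle(obj: object) -> bool:
    """Return True if obj can be serialized with pickle, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True


# Plain data serializes without trouble.
plain_state = {"slider_value": 3, "rows": [1, 2, 3]}

# A lock wraps an OS-level resource and cannot be pickled.
live_state = {"lock": threading.Lock()}

print(can_pickle(plain_state))
print(can_pickle(live_state))
```

Any notebook variable of the second kind would have to be recreated, not restored, on whichever instance handles the next request.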

I think a better request would be a load balancer that can manage multiple instances.
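
One way such a load balancer could keep a session on a single instance is to derive the backend from the session id. The sketch below is an assumption about how that might look, not an existing marimo feature; `pick_backend` and the backend addresses are hypothetical.

```python
import hashlib


def pick_backend(session_id: str, backends: list[str]) -> str:
    """Map a session id to one backend deterministically."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# Hypothetical replica addresses.
BACKENDS = ["marimo-0:8080", "marimo-1:8080", "marimo-2:8080"]

first = pick_backend("s_p41hf3", BACKENDS)
second = pick_backend("s_p41hf3", BACKENDS)
print(first == second)
```

Plain modulo hashing remaps most sessions whenever the backend list changes; a consistent-hashing ring would reduce that, at the cost of more code.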

Based on what you know about marimo, if you have suggestions on how marimo might one day support stateless execution, we're open to hearing them.

akshayka avatar Jul 23 '24 22:07 akshayka

@akshayka one model you can follow is how Airflow serializes its objects. It would be useful for deployed programs, since deployed programs are typically built to be stateless.

https://airflow.apache.org/docs/apache-airflow/stable/authoring-and-scheduling/serializers.html

sherodtaylor avatar Sep 24 '24 14:09 sherodtaylor

Thanks for the link. We'll likely look into this one day (perhaps we could patch globals() to hit an external cache) — I appreciate how this would make horizontal scaling very easy — but it's not on our short-term roadmap.
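
A rough sketch of what "a namespace that hits an external cache" could mean, under loud assumptions: this is not how marimo works, `RemoteNamespace` is a hypothetical class, and a plain dict stands in for the external store. It relies on `exec` accepting any mapping as its locals argument.

```python
import pickle
from collections.abc import MutableMapping
from typing import Any, Iterator


class RemoteNamespace(MutableMapping):
    """Namespace that pickles every top-level variable into an external store."""

    def __init__(self, store: dict) -> None:
        self._store = store  # stand-in for a remote key-value store

    def __getitem__(self, name: str) -> Any:
        # A KeyError on a miss lets exec() fall back to globals and builtins.
        return pickle.loads(self._store[name])

    def __setitem__(self, name: str, value: Any) -> None:
        self._store[name] = pickle.dumps(value)

    def __delitem__(self, name: str) -> None:
        del self._store[name]

    def __iter__(self) -> Iterator[str]:
        return iter(self._store)

    def __len__(self) -> int:
        return len(self._store)


shared_store: dict = {}

# "Replica A" runs one cell...
exec("x = 21", {}, RemoteNamespace(shared_store))
# ...and "replica B" runs a dependent cell with a fresh namespace object.
exec("y = x * 2", {}, RemoteNamespace(shared_store))

result = pickle.loads(shared_store["y"])
print(result)
```

This only covers top-level names with picklable values. Functions defined in a cell resolve names through the real globals dict, and unpicklable objects would raise on assignment, so a real design would need to decide what to do in both cases.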

akshayka avatar Sep 24 '24 14:09 akshayka

@akshayka any idea whether this has been prioritized? I'd really like to use it. It would be nice to have a Redis cache of some sort.

sherodtaylor avatar Feb 06 '25 17:02 sherodtaylor

Just testing my understanding, would something like --global-session (https://github.com/marimo-team/marimo/pull/2489) coupled with caching (https://docs.marimo.io/api/caching/#marimo.persistent_cache) address this?

edit: Updated link to persistent_cache

dmadisetti avatar Feb 06 '25 17:02 dmadisetti

@dmadisetti I think that would work if there were Redis integration and the global session were somehow shared in Redis, maybe through persistent_cache?

sherodtaylor avatar Feb 06 '25 18:02 sherodtaylor

I think adding redis integration to persistent_cache would be some low hanging fruit, and something we've been thinking about (just a general remote cache).
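
For illustration only, a general remote cache could be sketched as a decorator over a pluggable store. Nothing here is marimo's API: `CacheStore`, `InMemoryStore`, and `remote_cache` are hypothetical names, and the in-memory store stands in for a Redis client, whose `get` and `set` calls have a compatible shape.

```python
from __future__ import annotations

import hashlib
import pickle
from typing import Any, Callable, Protocol


class CacheStore(Protocol):
    def get(self, key: str) -> bytes | None: ...
    def set(self, key: str, value: bytes) -> None: ...


class InMemoryStore:
    """Stand-in for a remote store; only get and set are needed."""

    def __init__(self) -> None:
        self._data: dict[str, bytes] = {}

    def get(self, key: str) -> bytes | None:
        return self._data.get(key)

    def set(self, key: str, value: bytes) -> None:
        self._data[key] = value


def remote_cache(store: CacheStore) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Cache a function's picklable results in the given store."""

    def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            # Key on the function name and its arguments.
            raw_key = pickle.dumps((fn.__qualname__, args, sorted(kwargs.items())))
            key = hashlib.sha256(raw_key).hexdigest()
            cached = store.get(key)
            if cached is not None:
                return pickle.loads(cached)
            result = fn(*args, **kwargs)
            store.set(key, pickle.dumps(result))
            return result

        return wrapper

    return decorator


store = InMemoryStore()
calls: list[int] = []


@remote_cache(store)
def expensive_sum(n: int) -> int:
    calls.append(n)  # records how often the body really runs
    return sum(range(n))


first = expensive_sum(1000)
second = expensive_sum(1000)
print(first, second, len(calls))
```

Because the key ignores the function body, a real implementation would also need an invalidation story for when the code changes.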

I think a persistent session becomes pretty tricky, and we'd have to choose what we want to serialize. What stateful information would you want to persist across stateless app instantiations? UI values, mo.State? Or is this generally not needed, and a fresh session à la --global-session is enough? You could roll it yourself:

stateful_vars = load_stateful_from_redis()
my_state, my_ui_element, my_shared_object_with_varying_state = create_stateful_objs_from_dump(stateful_vars)
# e.g.
f""" Just a simple dashboard showing
{my_ui_element}
which is the same for every session
"""
with persistent_cache("result_ideally_loaded_from_redis", type="redis") as cache_attempt:
    do_app_level_computations(my_shared_object_with_varying_state)
if not cache_attempt.hit:
    dump_stateful_obj(my_shared_object_with_varying_state)
dump_ui_state(my_state, my_ui_element)

But I'm open to other API suggestions, and to ideas on how a more native solution could smooth this over.

dmadisetti avatar Feb 06 '25 18:02 dmadisetti

I think a persistent session is necessary because our system doesn't support sticky sessions. That means the session needs to persist remotely, since the API calls will go through a round-robin load balancer and could hit different machines.
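
The failure mode can be reproduced in a toy model. The sketch below is a simplification with hypothetical names (`Replica`, `count_failures`), not marimo's server code: each replica checks a session registry, and requests are spread round-robin. With per-replica registries every other request fails with the same error as in the traceback above; with one shared registry (a dict standing in for a remote store) none do.

```python
from __future__ import annotations

import itertools


class Replica:
    """Toy server instance that validates session ids against a registry."""

    def __init__(self, name: str, sessions: dict[str, dict]) -> None:
        self.name = name
        self.sessions = sessions

    def create_session(self, session_id: str) -> None:
        self.sessions[session_id] = {"owner": self.name}

    def require_session(self, session_id: str) -> dict:
        if session_id not in self.sessions:
            raise ValueError(f"Invalid session id: {session_id}")
        return self.sessions[session_id]


def count_failures(replicas: list[Replica], requests: int) -> int:
    """Create a session on the first replica, then round-robin the requests."""
    replicas[0].create_session("s_demo")
    failures = 0
    for replica in itertools.islice(itertools.cycle(replicas), requests):
        try:
            replica.require_session("s_demo")
        except ValueError:
            failures += 1
    return failures


# Each replica keeps its own in-memory registry.
isolated = [Replica("a", {}), Replica("b", {})]

# All replicas share one registry (stand-in for a remote store).
shared_registry: dict[str, dict] = {}
shared = [Replica("a", shared_registry), Replica("b", shared_registry)]

isolated_failures = count_failures(isolated, requests=4)
shared_failures = count_failures(shared, requests=4)
print(isolated_failures, shared_failures)
```

The toy registry holds only metadata. A real session also owns a running kernel, which is the part that cannot simply be moved into a shared store.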

sherodtaylor avatar Feb 07 '25 14:02 sherodtaylor

Oh I see, so down to the request level it needs to be session agnostic.

Have you considered just using the containers to create / serve a WASM page? Getting dynamic information from unsupported libraries would then be as easy as a refresh, and it would still be interactive within WASM. Alternatively, you could also serve public/ to return dynamic information (https://docs.marimo.io/guides/wasm/?h=public#including-data) - and just do as much as you can in WASM for everything else.


Maybe true stateless execution is possible? There's a lot of WebSocket traffic and miscellaneous requests, but I feel like WASM could get you most of the way there, and maybe an expanded Islands (server-side initial run) could get you further: https://docs.marimo.io/guides/exporting/?h=islands#islands-in-action

dmadisetti avatar Feb 07 '25 14:02 dmadisetti

Is this issue being worked on?

I have been encountering the same issue with marimo when trying to deploy in a stateless manner (my system doesn't support sticky sessions, and requests don't always get routed to the correct server/session).

I guess that adding an external cache (like redis) for all marimo servers to save the state (session data, etc) would resolve the aforementioned issue and would allow horizontal scaling. Is this something that can be implemented as of today? Any recommendations by the Marimo team for addressing this problem?

gnkl avatar Nov 14 '25 14:11 gnkl