Can't interact with tables while other cells are running

Open delennc opened this issue 8 months ago • 5 comments

Describe the bug

This might be more of a feature request, but when there are long-running cells I'd still like to be able to interact with non-dependent cell outputs. Currently it seems that all table interaction is blocked while any other cell is running.

Environment

Replace this line with the output of marimo env. Leave the backticks in place.

Code to reproduce

https://marimo.app/l/6ewbwz

Run the notebook, then try to paginate through the dataframe while the sleep is executing.

delennc avatar Apr 02 '25 20:04 delennc

Related is parallel execution: https://github.com/marimo-team/marimo/issues/1103

But cell output interaction shouldn't require that level of independence

dmadisetti avatar Apr 03 '25 00:04 dmadisetti

@akshayka I think a natural progression to parallel compute would be a single worker model, which could free up the kernel for calls like this

dmadisetti avatar Apr 03 '25 17:04 dmadisetti

Are you thinking threads or processes? A thread could work, but would still need to be careful that when an RPC is running in the kernel, the user doesn't subsequently run (or delete) cells that use that RPC's variables.

Maybe:

  • when a cell is running, you are allowed to run an RPC on "unrelated" cells
  • when an RPC is running, you are not allowed to run cells that depend on the owning cell's variables

However, right now I am not sure whether RPCs are scoped to cells. They may be able to access any of the kernel globals, which could lead to races. So we would need to make RPC semantics stricter.
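For concreteness, here is a minimal sketch of the two rules above. This is not marimo's implementation: `Cell`, `Guard`, `related`, and the `defs`/`refs` fields are hypothetical names, it only checks direct dependencies (a real version would walk the dataflow graph transitively), and it assumes each RPC is owned by exactly one cell.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    """Hypothetical cell record, not a marimo type."""
    name: str
    defs: frozenset = frozenset()  # variables the cell defines
    refs: frozenset = frozenset()  # variables the cell reads


def related(a: Cell, b: Cell) -> bool:
    """True if the cells are the same or share a direct def/ref edge."""
    return a.name == b.name or bool(a.defs & b.refs) or bool(b.defs & a.refs)


class Guard:
    """Tracks running cells and the owners of in-flight RPCs."""

    def __init__(self) -> None:
        self.running_cells: list[Cell] = []
        self.rpc_owners: list[Cell] = []

    def can_start_rpc(self, owner: Cell) -> bool:
        # Rule 1: an RPC may start only if its owning cell is
        # unrelated to every cell that is currently running.
        return not any(related(owner, c) for c in self.running_cells)

    def can_run_cell(self, cell: Cell) -> bool:
        # Rule 2: a cell may not run while an RPC is in flight on a cell
        # whose variables it reads. Re-running the owner itself is also
        # blocked here, which is an extra assumption of this sketch.
        return not any(
            cell.name == o.name or bool(cell.refs & o.defs)
            for o in self.rpc_owners
        )


slow = Cell("slow", defs=frozenset({"result"}))
table = Cell("table", defs=frozenset({"df"}))
consumer = Cell("consumer", defs=frozenset({"summary"}), refs=frozenset({"df"}))

guard = Guard()
guard.running_cells.append(slow)
rpc_allowed = guard.can_start_rpc(table)        # table is unrelated to slow

guard.rpc_owners.append(table)
consumer_allowed = guard.can_run_cell(consumer)  # consumer reads df
```

This only works if an RPC can touch nothing but its owning cell's variables, which is the stricter scoping mentioned above.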

akshayka avatar Apr 03 '25 17:04 akshayka

Hmm, maybe a parameter server model that manages data ownership makes sense. The disadvantage of memory transfer to the worker comes to mind, but multiprocessing.shared_memory is a thing (I have never used it though): https://docs.python.org/3/library/multiprocessing.shared_memory.html#module-multiprocessing.shared_memory

We could potentially do this only for dataframes, and use a narwhals layer if we need to standardize internally.
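For reference, a rough sketch of what the `multiprocessing.shared_memory` route looks like using only the standard library. The CSV-like payload is made up for illustration, and both handles live in one process here to keep it short. In a worker model the second handle would be opened in another process, using the block's name.

```python
from multiprocessing import shared_memory

# Made-up payload standing in for serialized dataframe bytes.
payload = b"col_a,col_b\n1,2\n3,4\n"

# Owner side: allocate a named block and write the payload into it once.
owner = shared_memory.SharedMemory(create=True, size=len(payload))
owner.buf[:len(payload)] = payload

# Worker side: attach to the same block by name. The block may be larger
# than requested on some platforms, so slice by the known payload length.
worker = shared_memory.SharedMemory(name=owner.name)
received = bytes(worker.buf[:len(payload)])  # copied out only to inspect it

# Every handle must be closed; the owner also unlinks to free the block.
worker.close()
owner.close()
owner.unlink()
```

The worker reads through `worker.buf` without the data being sent over a pipe or pickled. The cost is that something has to own the block's lifetime and call `unlink()`, which is where a parameter-server style owner would fit.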

dmadisetti avatar Apr 03 '25 18:04 dmadisetti

We actually do use shm right now, to expose data to the server without memory transfer (virtual files).

akshayka avatar Apr 04 '25 19:04 akshayka