vivarium-core
Proposal: Improved Handling of Parallel Processes
Current Approach
Message Passing
When a process has `Process.parallel` set to true, we launch a parallel OS process that runs `vivarium.core.process._run_update()`. `_run_update()` contains a loop that repeatedly receives `(interval, state)` tuples from a pipe, passes those arguments to `Process.next_update()`, and sends the result back through the pipe. To stop the parallel process, we pass `interval=-1`.
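As a rough illustration of the loop described above, here is a minimal sketch. It is not the actual vivarium code: `ToyProcess` and `run_update_loop` are hypothetical names, and the loop is driven from a thread rather than a separate OS process so that the example stays self-contained.

```python
import threading
from multiprocessing import Pipe


class ToyProcess:
    """Hypothetical stand-in for a vivarium Process."""

    def next_update(self, interval, state):
        # Grow 'mass' at a fixed rate of 0.5 per unit of time.
        return {'mass': state['mass'] * 0.5 * interval}


def run_update_loop(connection, process):
    """Serve (interval, state) requests until interval == -1 arrives."""
    running = True
    while running:
        interval, state = connection.recv()
        if interval == -1:
            # Sentinel value: stop the loop.
            running = False
        else:
            connection.send(process.next_update(interval, state))
    connection.close()


parent_conn, child_conn = Pipe()
# A thread keeps this sketch self-contained; in vivarium the loop
# runs in a separate OS process.
worker = threading.Thread(
    target=run_update_loop, args=(child_conn, ToyProcess()))
worker.start()

parent_conn.send((2.0, {'mass': 10.0}))
update = parent_conn.recv()

parent_conn.send((-1, None))  # Ask the loop to stop.
worker.join()
```

Note that the only thing the main side can ever ask for here is an update, and the only control signal is the magic `interval` value.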
Tracking in Main OS Process
In the main OS process (which contains `Engine`), we store `ParallelProcess` instances in `Engine.parallel`. Then when we need an update from a process, we call `Engine._invoke_process`, which does the following:
- If the process is parallel, starts the `ParallelProcess` computing the update and returns it.
- If the process is non-parallel, creates and returns an instance of `InvokeProcess`, which has an interface similar to `ParallelProcess` but computes the update immediately. Note that there's an extra `invoke_process` function that computes the process's update, but this extra level of indirection appears unnecessary.
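The dispatch described above can be sketched roughly as follows. This is a simplified illustration, not the real `Engine` code: `ImmediateUpdate` and `DeferredUpdate` are hypothetical stand-ins for `InvokeProcess` and `ParallelProcess`, the `update()` and `get()` method names are assumptions, and a thread pool replaces the OS process and pipe.

```python
from concurrent.futures import ThreadPoolExecutor


class ToyProcess:
    """Hypothetical stand-in for a vivarium Process."""
    parallel = False

    def next_update(self, interval, state):
        return {'count': state['count'] + interval}


class ImmediateUpdate:
    """Plays the role of InvokeProcess: computes the update right away."""

    def __init__(self, process, interval, state):
        self._update = process.next_update(interval, state)

    def get(self):
        return self._update


class DeferredUpdate:
    """Plays the role of ParallelProcess: starts work, collects it later."""

    def __init__(self, executor, process):
        self._executor = executor
        self._process = process
        self._future = None

    def update(self, interval, state):
        # Start computing in the background and return immediately.
        self._future = self._executor.submit(
            self._process.next_update, interval, state)

    def get(self):
        # Block until the background computation has finished.
        return self._future.result()


def invoke_process(path, process, interval, state, parallel):
    """Return an object whose get() yields the process's update."""
    if process.parallel:
        handle = parallel[path]
        handle.update(interval, state)
        return handle
    return ImmediateUpdate(process, interval, state)


executor = ThreadPoolExecutor(max_workers=1)
serial = ToyProcess()
background = ToyProcess()
background.parallel = True
# Mirrors the idea of Engine.parallel: handles keyed by process path.
parallel = {('agents', 'background'): DeferredUpdate(executor, background)}

results = [
    invoke_process(
        ('agents', 'serial'), serial, 1, {'count': 0}, parallel).get(),
    invoke_process(
        ('agents', 'background'), background, 2, {'count': 10},
        parallel).get(),
]
executor.shutdown()
```

Either way the caller gets back something with the same `get()` interface, which is why the two classes can be treated uniformly.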
Problems
The way we currently track parallel processes has a number of downsides:
- We store `ParallelProcess` instances in `Engine.parallel`, but we also store the original `Process` instances in the store hierarchy (with a reference in `self.processes`).
  - This is unnecessary and wastes memory, especially for processes with large parameter dictionaries or internal states.
  - Having the original `Process` instances in the store hierarchy is confusing. A user can read out the internal state of those processes with no problem, but they're getting the state from a process that hasn't changed since the beginning of the simulation, which is not intuitive.
- There's no way to retrieve any information from a process besides its next update. This is a problem when we want to read out a process's internal state (e.g. for debugging or when working with inner simulations).
- The extra levels of indirection are confusing. Every time I work on this, I have to trace through the code again to remind myself what all the "invoke" things do.
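The stale-state problem is easy to reproduce outside vivarium. In the sketch below, `CountingProcess` is a hypothetical process with internal state, and `copy.deepcopy` mimics the fact that the parallel OS process works on its own copy of the object.

```python
import copy


class CountingProcess:
    """Hypothetical process that keeps internal state between updates."""

    def __init__(self):
        self.calls = 0

    def next_update(self, interval, state):
        self.calls += 1
        return {}


# The instance that stays behind in the main OS process.
original = CountingProcess()

# Stand-in for the copy that lives in the parallel OS process.
child_copy = copy.deepcopy(original)

for _ in range(3):
    child_copy.next_update(1.0, {})

# Reading state from the original gives the value from the start of
# the simulation, not the value the running copy has reached.
stale_calls = original.calls
actual_calls = child_copy.calls
```

A user inspecting `original` through the store hierarchy would see `stale_calls`, with nothing to indicate that the real value lives elsewhere.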
Proposed Solution
- Eliminate the extra `invoke_process` function that doesn't appear to do anything.
- Instead of storing `ParallelProcess` instances in `self.parallel`, put them directly into the store hierarchy with references in `self.processes`. Once a parallel process has been put on its own OS process, there should be no copies of it left in the main OS process.
- Systematize message-passing between the main and parallel OS processes as commands with arguments. `Process` will have a `run_command()` method that accepts a command name (string) and a tuple of arguments. The `_run_update()` function would handle the following commands:
  - `_halt`: Ignores arguments and shuts down the parallel process (equivalent of passing `interval=-1` currently)
  - `_next_update`: Takes `(interval, state)` as arguments and passes them to `Process.next_update()`

  Process authors can override `run_command()` to handle more commands, e.g. to return internal state variables. `ParallelProcess` will also provide `run_command()`, but instead of running commands itself, it will pass those commands through the pipe to its child process's `_run_update()` function, which will in turn pass them to `Process.run_command()`.

  I've started implementing this in #198
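To make the command protocol concrete, here is one way it could look. This is a sketch under assumptions, not the implementation in #198: the classes are simplified stand-ins, the `get_total` command and the `end()` method are hypothetical, and a thread replaces the parallel OS process so the example stays self-contained.

```python
import threading
from multiprocessing import Pipe


class Process:
    """Simplified stand-in for vivarium's Process."""

    def next_update(self, interval, state):
        raise NotImplementedError

    def run_command(self, command, args):
        # The base class knows no custom commands.
        raise ValueError(f'Unknown command: {command}')


class Counter(Process):
    """Hypothetical process that exposes internal state via a command."""

    def __init__(self):
        self.total = 0

    def next_update(self, interval, state):
        self.total += interval
        return {'total': self.total}

    def run_command(self, command, args):
        if command == 'get_total':
            return self.total
        return super().run_command(command, args)


def _run_update(connection, process):
    """Handle built-in commands; delegate the rest to the process."""
    while True:
        command, args = connection.recv()
        if command == '_halt':
            connection.close()
            break
        if command == '_next_update':
            interval, state = args
            connection.send(process.next_update(interval, state))
        else:
            connection.send(process.run_command(command, args))


class ParallelProcess:
    """Forwards commands through the pipe instead of running them."""

    def __init__(self, process):
        self._conn, child_conn = Pipe()
        # A thread stands in for the parallel OS process in this sketch.
        self._worker = threading.Thread(
            target=_run_update, args=(child_conn, process))
        self._worker.start()

    def run_command(self, command, args=()):
        self._conn.send((command, args))
        return self._conn.recv()

    def end(self):
        # '_halt' produces no reply, so it is sent without a recv().
        self._conn.send(('_halt', ()))
        self._worker.join()


parallel = ParallelProcess(Counter())
first = parallel.run_command('_next_update', (2, {}))
second = parallel.run_command('_next_update', (3, {}))
total = parallel.run_command('get_total')
parallel.end()
```

With this shape, reading internal state from a parallel process is just another command, and the main side never needs to hold its own copy of the process.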
I think this proposal addresses all the problems described above. However, it brings some new downsides:
- Processes in the store hierarchy will not all be `Process` instances anymore. Some will be `ParallelProcess` instances. We don't want to make `ParallelProcess` inherit from `Process` because it doesn't know enough to do things like generate composites, and adding those missing capabilities is overkill.
- This would change the behavior of public functions. I think this is really more of a bug fix than a breaking change since the way we currently store `Process` instances in the store hierarchy is very counter-intuitive. The biggest breaking change is probably removing `Engine.parallel`, but I doubt anyone relies on that.