Next-gen batch runner
Objective
The goal of this proposal is to redesign the Mesa batch runner into a modular, flexible system that separates the batch run process into three stages: Preparation, Running, and Processing. The focus will be on the Preparation stage, where different experimental designs can be used to generate run configurations. These configurations will be encapsulated in a dataclass that includes the model class and all relevant parameters, ensuring reusability in the Running stage.
Design Overview
- **Preparation Stage:**
  - Use a dataclass (`RunConfiguration`) to store the model class, run parameters, and configuration details (e.g., `max_steps`, `data_collection_period`).
  - Implement different configuration generators (e.g., full factorial, sparse grids, manual) to allow for flexible experiment designs.
- **Running Stage:**
  - The batch runner will execute all configurations, using multiprocessing when necessary. It will take a list of `RunConfiguration` objects and execute each run independently.
  - Results will be collected during execution and processed after all runs are completed.
- **Processing Stage:**
  - Results from the batch run will be processed into a usable format (e.g., a list of dictionaries, pandas DataFrames) for further analysis.
Key Components
1. `RunConfiguration` Dataclass
This dataclass stores all the information required to run a single configuration of the experiment.
```python
from dataclasses import dataclass
from typing import Any, Dict

from mesa import Model


@dataclass
class RunConfiguration:
    model_cls: type[Model]       # the model class to instantiate
    run_id: int                  # unique identifier for this run
    iteration: int               # replication number for this parameter combination
    parameters: Dict[str, Any]   # keyword arguments passed to the model
    max_steps: int               # maximum number of steps per run
    data_collection_period: int  # collect data every n steps
```
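For example, a single run could then be described like this (`MoneyModel` and its parameters are hypothetical, purely for illustration):

```python
config = RunConfiguration(
    model_cls=MoneyModel,  # hypothetical model class
    run_id=0,
    iteration=0,
    parameters={"n_agents": 100, "width": 10, "height": 10},
    max_steps=1000,
    data_collection_period=1,
)
```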
2. Configuration Generators
Provide different strategies for generating configurations:
- Full Factorial: Generate all combinations of parameters.
- Sparse Grid: Sample a subset of parameter space.
- Base case: Start with a reference scenario and vary parameters from there.
- Manual Configuration: Allow users to specify configurations explicitly.
Each generator will output a list of `RunConfiguration` objects.
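For instance, the full factorial generator could be little more than a wrapper around `itertools.product`. A minimal sketch, reusing the `RunConfiguration` dataclass above (the function signature is an assumption, not settled API):

```python
import itertools
from typing import Any, Dict, List


def full_factorial(
    model_cls: type,
    parameter_values: Dict[str, List[Any]],
    iterations: int = 1,
    max_steps: int = 1000,
    data_collection_period: int = 1,
) -> List[RunConfiguration]:
    """Create one RunConfiguration per parameter combination per iteration."""
    names = list(parameter_values)
    configurations = []
    for run_id, combination in enumerate(
        itertools.product(*parameter_values.values())
    ):
        for iteration in range(iterations):
            configurations.append(
                RunConfiguration(
                    model_cls=model_cls,
                    run_id=run_id,
                    iteration=iteration,
                    parameters=dict(zip(names, combination)),
                    max_steps=max_steps,
                    data_collection_period=data_collection_period,
                )
            )
    return configurations
```

With a hypothetical `MyModel`, `full_factorial(MyModel, {"density": [0.6, 0.7, 0.8], "vision": [5, 10]}, iterations=5)` would yield 3 × 2 × 5 = 30 configurations.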
3. Batch Runner Class
The `BatchRunner` class will manage the execution of all runs using the `RunConfiguration` objects. It will handle multiprocessing, progress tracking, and result collection.
```python
from multiprocessing import Pool
from typing import Any, Dict, List

from tqdm.auto import tqdm


class BatchRunner:
    def __init__(
        self,
        configurations: List[RunConfiguration],
        number_processes: int | None = 1,
        display_progress: bool = True,
    ):
        self.configurations = configurations
        self.number_processes = number_processes  # None: use all available cores
        self.display_progress = display_progress

    def run_all(self) -> List[Dict[str, Any]]:
        # Core logic to run all configurations in parallel or serially;
        # run_single (sketched below) executes one configuration.
        with Pool(self.number_processes) as pool:
            results = pool.imap_unordered(run_single, self.configurations)
            return list(tqdm(results, total=len(self.configurations),
                             disable=not self.display_progress))
```
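The per-run logic could live in a module-level helper so that it can be pickled for multiprocessing. A minimal sketch, assuming the model follows Mesa conventions (a `running` flag, a `step()` method, and a `datacollector`); the name `run_single` and the return format are assumptions, not settled API:

```python
def run_single(config: RunConfiguration) -> Dict[str, Any]:
    """Instantiate one model from its configuration and run it to completion."""
    model = config.model_cls(**config.parameters)
    steps = 0
    while getattr(model, "running", True) and steps < config.max_steps:
        model.step()
        steps += 1
    return {
        "run_id": config.run_id,
        "iteration": config.iteration,
        **config.parameters,
        # Assumes Mesa's DataCollector; data_collection_period handling omitted.
        "model_data": model.datacollector.get_model_vars_dataframe(),
    }
```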
Experimental designs might be one of the most important new things to support. I encountered this library that might be useful:
- https://github.com/relf/pyDOE3
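For illustration, assuming pyDOE3 keeps the pyDOE-style `lhs(n, samples=...)` signature, a Latin hypercube design could be mapped onto the proposed `RunConfiguration` objects roughly like this (the bounds and model class are placeholders):

```python
from pyDOE3 import lhs

design = lhs(2, samples=10)  # ten points in the unit square [0, 1]^2

bounds = {"density": (0.1, 0.9), "vision": (1.0, 10.0)}  # illustrative bounds
configurations = [
    RunConfiguration(
        model_cls=MoneyModel,  # hypothetical model class
        run_id=run_id,
        iteration=0,
        parameters={
            name: low + coord * (high - low)  # rescale [0, 1] to each factor's range
            for coord, (name, (low, high)) in zip(point, bounds.items())
        },
        max_steps=1000,
        data_collection_period=1,
    )
    for run_id, point in enumerate(design)
]
```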
Nice initiative! One thing to note is that the current `batch_run` function is already somewhat organized around these three stages. If you look at the code, it is composed of a "make_kwargs" function (corresponding to stage 1), a "run" function (stage 2.1), and a "collect" function (stage 2.2). Currently it just returns the collected data in a neutral format (a dict), but I originally envisioned a further processing function or functions that do something useful with the result (stage 3).
So I think your vision aligns nicely with the current structure. And I agree that the most important area of improvement is stage 1 and a clear "run configuration" definition.
I like the conceptual design. I would, however, design it to be easy to extend and combine with whatever experimental design generator you want to use, rather than trying to cover all of that ourselves. The same applies to the subsequent stages.
The motivation for this is that doing large-scale computational experimentation is its own can of worms and not, in my view, the core of the MESA library. It is easy to go overboard trying to build all of this into MESA, making it less and less useful for others. To wit, last week I spoke with various people who use NetLogo and do large-scale uncertainty quantification. None of them use NetLogo's BehaviorSpace; all use other packages that interface with NetLogo via Java. So, in my view, it is more important to establish a clean API for running a single experiment on a MESA model than to design a very elaborate batch runner.
Agree with @quaquel, but I think this is somewhat in line with what @EwoutH was proposing, in my understanding. The RunConfiguration should be that interface. Which tools you use to generate it is up to you, but we provide some basic configuration generators.
Although maybe `RunConfiguration` should be split into `ModelConfiguration` and `RunConfiguration`. The former details what an individual model should look like, and the latter how it is run. So we create a list(?) of model configurations and then pass that to the run configuration, which describes how stages 2 and 3 should be handled.
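A minimal sketch of what that split could look like (all names and fields here are assumptions, just to make the idea concrete):

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class ModelConfiguration:
    """Stage 1: what an individual model looks like."""
    model_cls: type
    parameters: Dict[str, Any]


@dataclass
class RunConfiguration:
    """Stages 2 and 3: how the models are run and processed."""
    models: List[ModelConfiguration]
    max_steps: int = 1000
    data_collection_period: int = 1
    number_processes: int | None = 1
```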
I guess there is a distinction between the inputs for creating experiments and the individual experiments themselves. To start with the latter: an experiment can be as simple as a dict of key-value pairs. Typically this will be passed directly to the `__init__` of the model.
The other is more subtle, and I lack a good name for it. It is basically the parameter space and some density function over this space. In the simplest case, this space is bounded, the axes are orthogonal to one another (i.e., they are independent; there are no correlations), and you assume a uniform distribution over the space (so all points are equally likely). Each of these assumptions can be relaxed, but doing so makes your life increasingly more difficult. Moreover, you have to specify how you want to sample points from this space (Monte Carlo, LHS, some factorial design, etc.), and you have to specify how many points you want to sample. All this interacts in a messy way. For example, if you have a factorial design, you normally specify the number of points on each dimension, whereas with a Monte Carlo sampler, you specify how many points in total you want to sample.
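To make the simplest case concrete, here is a stdlib-only sketch of a bounded, independent, uniform parameter space sampled with plain Monte Carlo (names are illustrative; a factorial sampler over the same space would instead take points per dimension):

```python
import random
from typing import Any, Dict, List, Tuple


def monte_carlo_sample(
    space: Dict[str, Tuple[float, float]],  # name -> (lower, upper), independent axes
    n_points: int,                          # total sampling budget, MC-style
    seed: int | None = None,
) -> List[Dict[str, Any]]:
    """Draw experiments uniformly from a bounded, orthogonal parameter space."""
    rng = random.Random(seed)
    return [
        {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        for _ in range(n_points)
    ]


experiments = monte_carlo_sample({"density": (0.1, 0.9), "vision": (1.0, 10.0)}, 100)
```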
Given all this, you have either a collection of experiments or an experiment generator. This you pass to the runner, which then executes them (potentially in parallel). It is only the last task that is properly the batch runner. The rest is the design of experiments.
An additional minor concern is that you typically want to run each experiment for multiple seeds. You can collapse the seed into the experiment, or delay it and let it be handled by the batch runner. Regardless, you need to track the seed of each experiment, of course, for replication purposes.
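A minimal sketch of the "collapse the seed into the experiment" option, using a master seed so the expansion itself is reproducible (the helper name and the "seed" key are assumptions):

```python
import random
from typing import Any, Dict, List


def expand_with_seeds(
    experiments: List[Dict[str, Any]],
    replications: int,
    master_seed: int = 42,
) -> List[Dict[str, Any]]:
    """Replicate each experiment with a distinct, reproducible seed."""
    rng = random.Random(master_seed)
    return [
        {**params, "seed": rng.getrandbits(32)}  # tracked for replication
        for params in experiments
        for _ in range(replications)
    ]
```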
https://github.com/projectmesa/mesa/discussions/2776 implies the importance of allowing users to (easily) construct their own RunConfigurations. Many users have design points from external experimental design tools or specific scenarios they want to test without running all possible combinations.
Hi, are there any updates on the status of the batch runner? I am running into a major issue: if you run the batch runner with multiple iterations and a specified random seed, the same seed is used for each iteration, essentially defeating the purpose of the random seed (see my question at #2835). Most of the workarounds I see involve using a for loop or something similar, but this would make parallelization (which is important for what I am doing) more difficult. Does anyone have another workaround, or is this something that could be implemented in the near term? This seems like a patch that would definitely be useful for a lot of people.
Thanks for reaching out! I replied in #2835
Potential GSoC 2026 project: https://github.com/projectmesa/mesa/discussions/2927#discussioncomment-15181136