Reproducible random initialization fails on init in multiprocess
By default, when creating subprocesses, Windows and macOS (since Python 3.8) “spawn” new processes, while Linux “forks” them (the default there before Python 3.14). https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
Spawn is slower, but it starts a clean Python instance with nothing preloaded. Fork is faster, and the child inherits whatever state the parent Python process already had. For ActivitySim, in the MP init step, the subprocess may have access to previously loaded tables, but not the random number generator. Since the ‘init’ subprocess sees that the tables are already there, it doesn’t reload them, and so never triggers the random number generator initialization. If the init subprocess then asks for a random number, it crashes.
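The difference in inherited state can be seen in a minimal sketch. The names `CACHE`, `report_cache`, and `keys_seen_by_child` are hypothetical and only for illustration; `CACHE` stands in for tables loaded before the workers start, and none of this is ActivitySim code.

```python
import multiprocessing as mp

# Hypothetical module-level cache standing in for tables that were
# loaded in the parent before any worker process was started.
CACHE = {}


def report_cache(queue):
    # Runs in the child: send back the keys this process can see.
    queue.put(sorted(CACHE))


def keys_seen_by_child(start_method):
    # Start one child with the requested start method and return
    # the cache keys that the child reports.
    ctx = mp.get_context(start_method)
    queue = ctx.Queue()
    proc = ctx.Process(target=report_cache, args=(queue,))
    proc.start()
    seen = queue.get(timeout=60)  # fail instead of hanging if the child dies
    proc.join()
    return seen
```

If the parent sets `CACHE["persons"]` and then calls `keys_seen_by_child("fork")`, the forked child reports `['persons']` because it starts as a copy of the parent. Run from a script that populates `CACHE` under an `if __name__ == "__main__":` guard, `keys_seen_by_child("spawn")` should report an empty list instead, since the spawned child re-imports the module and starts with an empty `CACHE`.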
Possible solutions include: using spawn on all platforms for consistency; checking and/or re-initializing the random number generators, tracing setup, etc. upon opening a pipeline; or force-wiping existing tables and reloading them when opening a pipeline.
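One of these options, checking for the random number generator when a pipeline is opened, might look like the following sketch. The `Pipeline` class and its methods are hypothetical stand-ins, not ActivitySim's API, and Python's `random.Random` is used here only as a placeholder for the real generator.

```python
import random


class Pipeline:
    """Hypothetical stand-in for a pipeline; not ActivitySim's actual API."""

    def __init__(self, base_seed):
        self.base_seed = base_seed
        self.tables = {}
        self.rng = None  # stays None until the generator is initialized

    def load_table(self, name):
        # In this sketch, loading a table is what normally sets up the RNG.
        self.tables[name] = [1, 2, 3]  # placeholder table contents
        self.rng = random.Random(self.base_seed)

    def open(self, required_tables):
        for name in required_tables:
            if name not in self.tables:
                self.load_table(name)
        # Guard: if every table was already present (for example, inherited
        # from a forked parent), no load ran, so initialize the RNG here.
        if self.rng is None:
            self.rng = random.Random(self.base_seed)

    def random_value(self):
        if self.rng is None:
            raise RuntimeError("random number generator was never initialized")
        return self.rng.random()
```

With the guard in `open`, a pipeline that starts with an inherited `persons` table draws the same first value as one that loads the table itself, because both end up with a generator seeded from `base_seed`.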
Some context here about why this issue cropped up: As part of the shadow pricing work (#613), there is a step where a data buffer is created to store the workplace or school locations of persons across all sub-processes when running in multi-process mode. In order to index this data buffer by person_id, the person table needs to be read in before the sub-processes are spun up. This was done by calling inject.get_table('persons'), which reads the person table and also sets up tracing and initializes the random number generator. As described above, this initialization of the random number generator before splitting off sub-processes breaks when running on Linux.
A workaround was implemented that reads in the raw persons table instead of the "injectable" persons table. This skips the unnecessary initialization of the random number generators and tracing at this stage of the code, and sidesteps the Linux vs Windows issue. It also avoids any unforeseen issues that might arise from changing how the sub-processes are spun up on different operating systems.