bioscrape
Using multiprocessing->map on Lineage simulations produces duplicate simulations
When multiprocessing's map function creates new worker processes, the state of NumPy's random number generator is copied exactly into all of the children, including its seed, so every child draws the same sequence of random numbers. The problem is described here: https://github.com/numpy/numpy/issues/9650
A suggested fix (https://stackoverflow.com/questions/24345637/why-doesnt-numpy-random-and-multiprocessing-play-nice) is to reset NumPy's random number seed at the beginning of the simulation, using OS-generated random numbers. Something like np.random.seed(int.from_bytes(os.urandom(4), byteorder='little')).
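For illustration, here is a minimal sketch of that reseeding step placed at the top of a worker function. The names fresh_seed and run_one are hypothetical, and run_one is only a stand-in for a real lineage simulation, not a bioscrape function. To stay deterministic and self-contained, the demo imitates what a child process inherits by resetting the global state to the same value before each call, rather than actually starting a pool.

```python
import os

import numpy as np


def fresh_seed():
    """Return a 32-bit seed drawn from the operating system's entropy source."""
    return int.from_bytes(os.urandom(4), byteorder="little")


def run_one(task_id):
    """Hypothetical stand-in for one simulation run inside a worker process."""
    # Reseed first, so the worker does not reuse the RNG state it inherited.
    np.random.seed(fresh_seed())
    # Placeholder for the real simulation call.
    return task_id, np.random.random(3).tolist()


if __name__ == "__main__":
    for task_id in range(3):
        # Imitate the inherited state: every "child" starts from the same seed.
        np.random.seed(0)
        print(run_one(task_id))
```

A function shaped like run_one can be passed to multiprocessing.Pool.map unchanged, since the reseeding happens inside the function and therefore inside each child.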
Are you using the py_simulate... methods? There are many places we could reset the seed, but I think inside one of those wrappers is best, with an optional keyword to skip the reset. Any thoughts?
I haven't tried replicating this exactly, but I don't see it happening anymore when using py_SingleCellLineage with multiprocessing.Pool.map. Did someone fix this? Or is it specific to certain map functions or certain lineage simulation functions?
I think the answer is to reseed the simulation as part of the multiprocessing function. I have not added any reseed functionality to any of the py_simulate_X functions. This seems like a reasonable addition: a keyword like seed = None. If it is None, the simulation reseeds using numpy; otherwise it uses the specific seed given.
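As a rough sketch of what such a keyword could look like, the wrapper below reseeds NumPy's global generator before calling a simulation function. Nothing here exists in bioscrape: simulate_with_seed and toy_simulation are hypothetical names, and drawing the fresh seed from os.urandom is an assumption carried over from the fix suggested earlier in this thread.

```python
import os

import numpy as np


def simulate_with_seed(simulate, *args, seed=None, **kwargs):
    """Hypothetical wrapper: reseed NumPy's global RNG, then call `simulate`.

    seed=None  -> draw a fresh 32-bit seed from os.urandom
    seed=<int> -> use that exact seed, for reproducible runs
    """
    if seed is None:
        seed = int.from_bytes(os.urandom(4), byteorder="little")
    np.random.seed(seed)
    return simulate(*args, **kwargs)


def toy_simulation(n):
    """Stand-in for a real simulation; returns n random draws."""
    return np.random.random(n).tolist()


if __name__ == "__main__":
    # Same explicit seed twice: the draws match.
    first = simulate_with_seed(toy_simulation, 3, seed=42)
    second = simulate_with_seed(toy_simulation, 3, seed=42)
    print(first == second)
    # No seed given: a fresh seed is drawn on every call.
    print(simulate_with_seed(toy_simulation, 3))
```

Defaulting to a fresh seed keeps parallel runs independent without any action from the caller, while an explicit seed still allows a single run to be reproduced.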