GerryChain
gerrychain.random sets random seed globally
gerrychain.random sets a default random seed (2018) globally:
https://github.com/mggg/GerryChain/blob/198f43dae397a7e361dcaa059a7ac8a4e0ff18f3/gerrychain/random.py#L5
This is bad—for starters, the state of other packages could be affected without users immediately noticing.
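To make the side effect concrete, here is a minimal sketch using only the standard library. `import_like_gerrychain` is a hypothetical stand-in for module-level seeding; it is an assumption for illustration, not the actual contents of gerrychain/random.py.

```python
import random


def import_like_gerrychain():
    # Hypothetical stand-in: mimics a module that seeds the global RNG
    # at import time with the default seed mentioned above.
    random.seed(2018)


def unrelated_library_draw():
    # Stand-in for any other package that draws from the shared global RNG.
    return random.random()


random.seed()  # the user expects the RNG to be seeded from OS entropy
import_like_gerrychain()
first = unrelated_library_draw()

random.seed()  # reseeding beforehand does not help
import_like_gerrychain()
second = unrelated_library_draw()

print(first == second)  # True: the "unrelated" draw is now deterministic
```

Nothing in the unrelated code asked for a fixed seed, yet its output is pinned as soon as the seeding module is loaded.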
It turns out this is a more serious problem than it looks at first glance (in other words, I encountered a real-world use case where this matters). In particular, when searching for seed plans, the recursive_seed_part algorithm occasionally gets stuck. This wouldn't be much of a problem if the RNG properly randomized itself. However, because the RNG reseeds itself on every import, it never gets unstuck in certain settings. To be clear, the scenario I encountered has two parts and results from:
- recursive_seed_part sometimes gets stuck
- the RNG resets itself on each import of GerryChain
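
The interaction of these two points can be sketched roughly as follows. `run_in_fresh_process` is a hypothetical helper that simulates relaunching the interpreter; the three integer draws stand in for whatever random choices the seed search makes and are not GerryChain's actual logic.

```python
import random

DEFAULT_SEED = 2018  # the default seed mentioned above


def run_in_fresh_process():
    # Simulates a new interpreter: the import reseeds the global RNG,
    # then a (hypothetical) randomized search makes its draws.
    random.seed(DEFAULT_SEED)
    return tuple(random.randint(0, 99) for _ in range(3))


# "Retrying" by relaunching the script five times.
outcomes = [run_in_fresh_process() for _ in range(5)]

print(len(set(outcomes)))  # 1: every relaunch replays the same draws
```

If that single replayed sequence happens to be one on which the search stalls, relaunching cannot escape it.
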
This leads to recursive_seed_part being permanently stuck when used a certain way. I think the solution would be either to:
a) disable the default random seed setting function (do we really need this when tools like pcompress exist?)
b) randomly seed the RNG only for seed partition generation (this seems tricky to do)
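
As a rough illustration of what (b) might look like, the sketch below gives the seed search its own `random.Random` instance, so retries draw from fresh streams while the global state stays pinned. `try_seed_partition` and `seed_with_retries` are hypothetical names, and the sketch assumes the search could accept an RNG object, which is not something the current API is known to offer.

```python
import random


def try_seed_partition(rng):
    # Hypothetical randomized search that succeeds only for some draws.
    return rng.random() < 0.5


def seed_with_retries(max_retries=100):
    # Each retry uses a fresh local RNG seeded from OS entropy, so it is
    # independent of whatever seed the global RNG was given.
    for attempt in range(1, max_retries + 1):
        rng = random.Random()
        if try_seed_partition(rng):
            return attempt
    raise RuntimeError("no seed partition found")


random.seed(2018)            # global state pinned, as after the import
expected_next = random.random()

random.seed(2018)
attempts = seed_with_retries()
actual_next = random.random()

# The local RNG never touched the global stream.
print(expected_next == actual_next)  # True
```

The design point is that the global seed would still govern the chain itself, while only the seed-partition search is free to vary between retries.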
@pjrule looks like this is intentional: https://gerrychain.readthedocs.io/en/latest/topics/reproducibility.html?highlight=If%20None#import-random-from-gerrychain-random
Ah, so it looks like my bug with recursive_seed_part is caused by this line at the top of tree.py: https://github.com/mggg/GerryChain/blob/3c42e993b2e0bfafe3877c99fda4bfa76e01c65d/gerrychain/tree.py#L5