
gerrychain.random sets random seed globally

Open pjrule opened this issue 3 years ago • 3 comments

gerrychain.random sets a default random seed (2018) globally: https://github.com/mggg/GerryChain/blob/198f43dae397a7e361dcaa059a7ac8a4e0ff18f3/gerrychain/random.py#L5

This is bad: for starters, the RNG state that other packages rely on could be affected without users immediately noticing.
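To make the concern concrete, here is a minimal sketch using only the standard library. `simulate_seeding_import` is a hypothetical stand-in for importing a module that seeds the shared `random` module at import time; it is not GerryChain code. Unrelated code that draws from `random` afterwards becomes deterministic even though it asked for entropy-based seeding.

```python
import random


def simulate_seeding_import():
    # Hypothetical stand-in for a library module that seeds the
    # process-wide RNG as a side effect of being imported.
    random.seed(2018)


def downstream_draws(n=3):
    # Unrelated code that draws from the shared global RNG.
    return [random.random() for _ in range(n)]


random.seed()               # caller asks for OS-entropy seeding
simulate_seeding_import()   # the "import" silently overrides that
first_run = downstream_draws()

random.seed()
simulate_seeding_import()
second_run = downstream_draws()

print(first_run == second_run)  # True
```

The caller never asked for a fixed seed, yet both runs produce identical draws.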

pjrule, Aug 30 '21 20:08

This turns out to be a more serious problem than it looks at first glance: I ran into a real-world use case where it matters. When searching for seed plans, the recursive_seed_part algorithm occasionally gets stuck. That would not be much of a problem if the RNG were seeded differently on each run. However, because the RNG is reseeded on every import, it never gets unstuck in certain settings. To be clear, the scenario I encountered has two parts:

  1. recursive_seed_part sometimes gets stuck
  2. the RNG is reset on each import of GerryChain
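The interaction of these two points can be sketched without GerryChain at all. In the sketch below, `attempt` is a hypothetical placeholder for one run of a randomized search, and `FIXED_SEED` mirrors the default seed mentioned above. If the RNG is reset to the same seed before every attempt, every retry takes the same path, so a stuck attempt stays stuck.

```python
import random

FIXED_SEED = 2018  # mirrors the default seed discussed in this issue


def attempt(reseed):
    # Hypothetical placeholder for one run of a randomized search.
    # The returned number stands in for "the path the search took".
    if reseed:
        random.seed(FIXED_SEED)
    return random.randrange(10**9)


# Retrying with a reset before each attempt: every retry is identical.
random.seed(FIXED_SEED)
with_reseed = {attempt(reseed=True) for _ in range(5)}

# Retrying without a reset: the RNG state advances between attempts.
random.seed(FIXED_SEED)
without_reseed = {attempt(reseed=False) for _ in range(5)}

print(len(with_reseed))  # 1
```

Only the second loop explores different paths, which is what a retry needs in order to escape a bad start.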

This leads to recursive_seed_part being permanently stuck when used a certain way. I think the solution would be to either:

  a) disable the default random seed setting function (do we really need this when tools like pcompress exist?), or
  b) randomly seed the RNG only for seed partition generation (this seems tricky to do)
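As a rough illustration of the general idea behind option (b), a function can draw from its own `random.Random` instance instead of the shared module-level RNG. This is a generic pattern, not GerryChain's API: `choose_start_node` and its `seed` parameter are hypothetical, and threading such an RNG through the existing tree functions is exactly the part that seems tricky.

```python
import random


def choose_start_node(nodes, seed=None):
    # Hypothetical helper: uses a private RNG so the global state is untouched.
    # seed=None seeds from OS entropy; an explicit seed stays reproducible.
    rng = random.Random(seed)
    return rng.choice(nodes)


nodes = list(range(100))

random.seed(2018)
state_before = random.getstate()

choice_a = choose_start_node(nodes, seed=7)  # reproducible on request
choice_b = choose_start_node(nodes, seed=7)
fresh = choose_start_node(nodes)             # varies between runs

state_after = random.getstate()
print(state_before == state_after)  # True
```

The global RNG state is unchanged by the calls, so reproducibility elsewhere is preserved while this one step can still vary between runs.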

InnovativeInventor, Mar 18 '22 15:03

@pjrule looks like this is intentional: https://gerrychain.readthedocs.io/en/latest/topics/reproducibility.html?highlight=If%20None#import-random-from-gerrychain-random

InnovativeInventor, May 20 '22 19:05

Ah, so it looks like my bug with recursive_seed_part is caused by this line at the top of tree.py: https://github.com/mggg/GerryChain/blob/3c42e993b2e0bfafe3877c99fda4bfa76e01c65d/gerrychain/tree.py#L5

InnovativeInventor, May 20 '22 20:05