swyft icon indicating copy to clipboard operation
swyft copied to clipboard

Add random state for reproducibility

Open fnattino opened this issue 3 years ago • 5 comments

Addressing #46: In order to have reproducible simulation sampling in the store, we should have the possibility of seeding the random number generation. My impression is that the seeding is probably best introduced within Bound, so that one could do:

bound = UnitCubeBound(n_parameters)
bound.set_seed(1234)
store.add(100, prior, bound)
store.add(100, prior, bound) # add more points following the previous stream of random numbers

The disadvantage is that one needs to instantiate the Bound object outside the store.

Tasks:

  • [x] code
  • [x] tests (?)
  • [ ] tutorial notebooks

fnattino avatar Nov 11 '21 12:11 fnattino

Do you agree @bkmi ?

fnattino avatar Nov 11 '21 12:11 fnattino

I like what you propose, but I haven't groked how this may affect the new store definition. I suppose we should try to merge the new store specification first then incorporate this request?

I'm still thinking about instantiating the bound outside the store... I think it's okay because the "deepest" function which will generate the bounds is most likely this one...

Perhaps we can provide an option to give a current rng, then "split" it... I'm not sure about that yet. I'm not super comfortable with these new randomness generators.

bkmi avatar Nov 30 '21 13:11 bkmi

@bkmi Hi Ben, thanks for the feedback. I have adapted the current modifications to the `simplify branch as much as possible. and added some minimum tests.

Indeed we can incorporate this one later. I did not update the notebook with the bound seeding method, since the tutorial notebooks are updated in master.

rogerkuou avatar Nov 30 '21 14:11 rogerkuou

the notebooks from master have been brought into simplify now.

bkmi avatar Nov 30 '21 15:11 bkmi

(Thanks a lot @rogerkuou for continuing with this work which I had left halfway!)

Perhaps we can provide an option to give a current rng, then "split" it...

@bkmi do you mean to provide the possibility to set a seeded rng in Store and then propagate this to all the calls to pdf.sample()? We could avoid storing rng as an attribute into the bound, but do something like:

class Bound:
  ...
  def sample(self, n_samples, rng=None):
    rng = rng or np.random.default_rng()
    ...
  ...

In this way one would not need to seed the bound outside the store - one would seed the store instead. Also, the seeded rng in the store could be used to draw samples from the Poissonian here - which I had overlooked earlier - and which would still not be covered if we were to store rng in the bound..

fnattino avatar Dec 01 '21 22:12 fnattino

This is a pull request for v0.3, which is not actively developed. Closed for now, although the topic of reproducibility is relevant for v0.4 as well.

cweniger avatar May 13 '23 21:05 cweniger