openmmtools icon indicating copy to clipboard operation
openmmtools copied to clipboard

MultiStateReporter variable pos/vel save frequency

Open richardjgowers opened this issue 1 year ago • 14 comments

Description

allows MultiStateReporter to write positions and velocities and different frequency to energies data.

Todos

  • [x] Implement feature / fix bug
  • [x] Add tests
  • [x] Update documentation as needed
  • [x] Update changelog to summarize changes in behavior, enhancements, and bugfixes implemented in this PR

Status

  • [ ] Ready to go

Changelog message

MultiStateReporter now takes optional `position_interval` and `velocity_interval` keyword args to control the frequency these are saved at

richardjgowers avatar Jul 13 '23 12:07 richardjgowers

Will we run into key errors if you try and use this with an .nc file from an older version?

mikemhenry avatar Jul 13 '23 18:07 mikemhenry

Codecov Report

Merging #712 (d8f80b9) into main (cbff4c8) will increase coverage by 0.00%. The diff coverage is 92.85%.

:exclamation: Current head d8f80b9 differs from pull request most recent head 981fe1b. Consider uploading reports for the commit 981fe1b to get more accurate results

Additional details and impacted files

codecov[bot] avatar Jul 13 '23 19:07 codecov[bot]

@mikemhenry it shouldn't affect reading old files, I've only played with the writing code.

In terms of old code reading newer nc files, unless you've changed the defaults from writing positions/velocities on every energy write you should get the same. If you read files with missing data you now get:

>>> ds.variables['positions'][1, :, :]

masked_array(
  data=[[--, --, --],
        [--, --, --],
        [--, --, --],
        [--, --, --],
        [--, --, --],
        [--, --, --],
        [--, --, --],

so it looks like netcdf isn't erroring but instead giving you the lack of data.

I have added two attributes to the DataSet, so if you write code that expects these you're going to have a bad time.

Is this what ncfile.ConventionVersion = '0.2' is made to handle? Should I be bumping that?

richardjgowers avatar Jul 17 '23 15:07 richardjgowers

@ijpulidos the filesizes for MultiStateReporter can get pretty large, since you're saving ~11 trajectories. You typically want to save the energies at a high frequency (since you're very interesting in seeing correlation times) but positions you can have these at a lower frequency, velocities you might not even care about.

Can you point me towards the existing tests so I can make sure the tests I write fit in?

richardjgowers avatar Aug 23 '23 19:08 richardjgowers

I see, yeah I had the feeling it was a storage issue. We probably need to clean things up in general but I think this will help. Thanks for clarifying the motivation.

As per the tests, I think having them as part of the TestReporter class should be the place, we might consider moving this class to its own test module, but we don't need that right now. Thanks!

ijpulidos avatar Aug 25 '23 15:08 ijpulidos

In addition to adding tests, we also want to add

  • energy frequency

And double check what gets saved when

mikemhenry avatar Nov 30 '23 16:11 mikemhenry

We want to figure out what should be saved in the checkpoint and what should be saved in simulation.nc

mikemhenry avatar Nov 30 '23 16:11 mikemhenry

@mikemhenry Yes, I agree we probably need to discuss first what we want to store and when. And which of these are needed when resuming simulations and similar situations.

If we want to have all of these independently we probably want to sync them with the checkpointing. Such that the required values are up to date when resuming or extending simulations.

ijpulidos avatar Dec 01 '23 18:12 ijpulidos

Yah I think we need to also make sure we figure out what our analysis tools expect. My hunch is that we need to save a lot of stuff to resume a simulation, but that should be a checkpoint, the time series data we save for analysis should be user configurable.

mikemhenry avatar Dec 01 '23 18:12 mikemhenry

the time series data we save for analysis should be user configurable.

Yes, decoupling the checkpointing with the user-configurable time series data would probably work better here. I'd vote for that route if it makes sense.

As far as I can tell these are coupled. We would need to think a little bit further about this, since I'm not sure how big of a refactor this could be. There's a lot of moving back and forth between the MultiStateReporter and the MultiStateSampler.

ijpulidos avatar Dec 01 '23 19:12 ijpulidos

@ijpulidos @mikemhenry this is much closer to ready, I've added some tests so might be a good time for you both to read through

richardjgowers avatar Jan 12 '24 16:01 richardjgowers

@richardjgowers getting some errors in CI

Traceback (most recent call last):
  File "/home/runner/micromamba-root/envs/openmmtools-test/lib/python3.9/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/runner/work/openmmtools/openmmtools/openmmtools/tests/test_sampling.py", line 523, in test_write_sampler_states_no_pos
    assert restored_state.positions._value.mask.all()
AttributeError: 'numpy.ndarray' object has no attribute 'mask'

See the CI log here: https://github.com/choderalab/openmmtools/actions/runs/8052365133

Honestly if you make a commit to bump this PR, you should get the same results, I was having an issue re-starting CI

mikemhenry avatar Feb 26 '24 17:02 mikemhenry

I wonder if this is what's biting us here, though I don't know why are we expecting these arrays to be masked? https://github.com/choderalab/openmmtools/pull/701

ijpulidos avatar Mar 21 '24 16:03 ijpulidos

Now, to be fair, I don't think the changes in that linked PR have anything to do with the serialized/deserialized positions or velocities. So I doubt it.

ijpulidos avatar Mar 21 '24 16:03 ijpulidos