openmmtools
openmmtools copied to clipboard
Deserializing nc file generated with openmmtools < 0.21.5 yields key error when loading with openmmtools 0.21.5 because of missing constraint tolerance
from openmmtools.multistate import MultiStateReporter
main_dir = 45
replicate = 0
sub_dir = 3
phase = 'complex'
path = f"/data/chodera/zhangi/perses_benchmark/repex/perses-bnbs-paper-first-attempt/{main_dir}/{sub_dir}/replicate_{replicate}/perses-paper2/{sub_dir}/{sub_dir}_{phase}.nc"
reporter = MultiStateReporter(path, open_mode='r')
reporter.read_mcmc_moves()
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Input In [24], in <cell line: 1>()
----> 1 reporter.read_mcmc_moves()
File ~/miniconda3/envs/perses-paper2/lib/python3.9/site-packages/openmmtools-0.21.5-py3.9.egg/openmmtools/multistate/multistatereporter.py:773, in MultiStateReporter.read_mcmc_moves(self)
771 for move_id in range(n_moves):
772 serialized_move = self.read_dict('mcmc_moves/move{}'.format(move_id))
--> 773 mcmc_moves.append(deserialize(serialized_move))
774 return mcmc_moves
File ~/miniconda3/envs/perses-paper2/lib/python3.9/site-packages/openmmtools-0.21.5-py3.9.egg/openmmtools/utils.py:678, in deserialize(serialization)
676 instance = cls.__new__(cls)
677 try:
--> 678 instance.__setstate__(serialization)
679 except AttributeError:
680 raise ValueError('Cannot deserialize class {} without a __setstate__ method'.format(class_name))
File ~/miniconda3/envs/perses-paper2/lib/python3.9/site-packages/openmmtools-0.21.5-py3.9.egg/openmmtools/mcmc.py:1160, in LangevinDynamicsMove.__setstate__(self, serialization)
1158 self.timestep = serialization['timestep']
1159 self.collision_rate = serialization['collision_rate']
-> 1160 self.constraint_tolerance = serialization['constraint_tolerance']
KeyError: 'constraint_tolerance'
Note that Mike was worried about this in this PR: https://github.com/choderalab/openmmtools/pull/611
Good catch! This happens when trying to directly read the mcmc moves using the public API point MultiStateReporter.read_mcmc_moves
. Just to realize that when resuming the simulations the moves are instantiated through the sampler from_storage
method. As per https://github.com/choderalab/openmmtools/pull/611#issuecomment-1201979984
My guess is that the impact of this is not big, but this is a bug, since we care for backward compatibility, and also since read_mcmc_moves
is public and should be expected to be directly run by the users. I'll be adding an example nc
file that can be used to replicate the issue.
As a possible workaround, one could use MultiStateSampler.from_storage(reporter)
and extract the mcmc_moves
from the sampler.
To reproduce the problem you can use the nc files attached here and run the following script using them
#!/bin/env python
# Directly read the mcmc_moves from reporter
from openmmtools.multistate import MultiStateReporter
reporter_path = "alanine_dipeptide_legacy.nc"
reporter = MultiStateReporter(reporter_path, checkpoint_interval=1, open_mode='r')
mcmc_moves = reporter.read_mcmc_moves()
I thought we had this tested?
I thought we had this tested?
So, it was tested by resuming simulation, which use the MultiStateSampler.from_storage
method, but we didn't really test using the reporter.read_mcmc_moves
public API point directly (my bad!).