CRITICAL: Experiment NaN !

Open kingljy0818 opened this issue 4 years ago • 5 comments

Hi,

When I run my .yaml file, I always get the following NaN error:


2020-04-13 23:08:03,375: Single node: executing <bound method MultiStateReporter.write_energies of <openmmtools.multistate.multistatereporter.MultiStateReporter object at 0x7f6b1e3b7350>>
2020-04-13 23:08:03,378: ********************************************************************************
2020-04-13 23:08:03,378: Iteration 1/500
2020-04-13 23:08:03,378: ********************************************************************************
2020-04-13 23:08:03,378: Single node: executing <function ReplicaExchangeSampler._mix_replicas at 0x7f6b202b4b00>
2020-04-13 23:08:03,378: Mixing replicas...
2020-04-13 23:08:03,401: Mixing of replicas took 0.023s
2020-04-13 23:08:03,401: Accepted 166662/781250 attempted swaps (21.3%)
2020-04-13 23:08:03,401: Propagating all replicas...
2020-04-13 23:08:03,401: Running _propagate_replica serially.
2020-04-13 23:08:27,631: WARNING - openmmtools.mcmc - Potential energy is NaN after 0 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart...
2020-04-13 23:08:33,960: WARNING - openmmtools.mcmc - Potential energy is NaN after 1 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart...
2020-04-13 23:08:39,945: WARNING - openmmtools.mcmc - Potential energy is NaN after 2 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart...
2020-04-13 23:08:45,901: WARNING - openmmtools.mcmc - Potential energy is NaN after 3 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart...
2020-04-13 23:08:51,945: WARNING - openmmtools.mcmc - Potential energy is NaN after 4 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart...
2020-04-13 23:08:58,094: ERROR - openmmtools.mcmc - Potential energy is NaN after 5 attempts of integration with move LangevinSplittingDynamicsMove Trying to reinitialize Context as a last-resort restart attempt...
2020-04-13 23:09:06,191: ERROR - openmmtools.mcmc - Potential energy is NaN after 6 attempts of integration with move LangevinSplittingDynamicsMove
2020-04-13 23:09:16,098: CRITICAL - openmmtools.multistate.multistatesampler - Propagating replica 4 at state 1 resulted in a NaN! The state of the system and integrator before the error were saved in LZ03-mertk-explicit-output/experiments/nan-error-logs
2020-04-13 23:09:16,103: CRITICAL - yank.experiment -

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
! CRITICAL: Experiment NaN !
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
The following experiment threw a NaN! It should NOT be considered!
Experiment: LZ03-mertk-explicit-output/experiments/
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

2020-04-13 23:09:16,124: Single node: executing <bound method MultiStateReporter.close of <openmmtools.multistate.multistatereporter.MultiStateReporter object at 0x7f6b1e3b7350>>
Please cite the following:

    Friedrichs MS, Eastman P, Vaidyanathan V, Houston M, LeGrand S, Beberg AL, Ensign DL, Bruns CM, and Pande VS. Accelerating molecular dynamic simulations on graphics processing unit. J. Comput. Chem. 30:864, 2009. DOI: 10.1002/jcc.21209
    Eastman P and Pande VS. OpenMM: A hardware-independent framework for molecular simulations. Comput. Sci. Eng. 12:34, 2010. DOI: 10.1109/MCSE.2010.27
    Eastman P and Pande VS. Efficient nonbonded interactions for molecular dynamics on a graphics processing unit. J. Comput. Chem. 31:1268, 2010. DOI: 10.1002/jcc.21413
    Eastman P and Pande VS. Constant constraint matrix approximation: A robust, parallelizable constraint method for molecular simulations. J. Chem. Theor. Comput. 6:434, 2010. DOI: 10.1021/ct900463w
    Chodera JD and Shirts MR. Replica exchange and expanded ensemble simulations as Gibbs multistate: Simple improvements for enhanced mixing. J. Chem. Phys., 135:194110, 2011. DOI:10.1063/1.3660669
    

dcdplugin) Could not access file 'LZ03-mertk-explicit-output/experiments/trailblaze/complex/coordinates.dcd'.
dcdplugin) Could not access file 'LZ03-mertk-explicit-output/experiments/trailblaze/solvent/coordinates.dcd'.


I would appreciate your help in solving this problem. Many thanks.

kingljy0818 • Apr 14 '20 10:04

Since the NaN occurs at the first iteration, I suspect that the initial structure you feed to YANK has clashes and the minimization was not successful. If you are not minimizing, I'd first try setting minimize: yes. Otherwise, you'll probably have to fix the initial structure.
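To check where in a run the NaN first shows up, the log can be scanned for the messages quoted above. The following is a minimal sketch, assuming the log uses exactly the message wording shown in this thread; the function name `summarize_nan_events` and the summary keys are made up for illustration and are not part of YANK or openmmtools.

```python
import re

# Patterns copied from the log messages quoted in this thread.
EVENT_RE = re.compile(
    r"Iteration (?P<iteration>\d+)/\d+"
    r"|Potential energy is NaN after (?P<attempts>\d+) attempts"
    r"|Propagating replica (?P<replica>\d+) at state (?P<state>\d+) resulted in a NaN"
)


def summarize_nan_events(log_text):
    """Report the iteration of the first NaN warning and the failing replica/state.

    Hypothetical helper: works on raw log text, whether or not the
    entries are separated by newlines.
    """
    summary = {
        "first_nan_iteration": None,
        "nan_warnings": 0,
        "failed_replica": None,
        "failed_state": None,
    }
    current_iteration = None
    for match in EVENT_RE.finditer(log_text):
        if match.group("iteration") is not None:
            # Remember which iteration the following messages belong to.
            current_iteration = int(match.group("iteration"))
        elif match.group("attempts") is not None:
            summary["nan_warnings"] += 1
            if summary["first_nan_iteration"] is None:
                summary["first_nan_iteration"] = current_iteration
        else:
            summary["failed_replica"] = int(match.group("replica"))
            summary["failed_state"] = int(match.group("state"))
    return summary
```

If `first_nan_iteration` comes back as 1, the failure happens before any real sampling, which points at the starting structure or the minimization rather than at the simulation settings.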

andrrizzi • Apr 16 '20 14:04

I have already set minimize: yes. The binding mode of the tested small molecule in the receptor was obtained by superimposing it onto the ligand in the crystal structure of the complex; this may be what caused the NaN.
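A ligand placed by superposition can overlap receptor atoms, so a quick distance check before running YANK may help. Below is a minimal sketch, assuming both molecules are available as standard fixed-column PDB text; the function names and the 2.0 angstrom cutoff are illustrative assumptions, not values taken from YANK.

```python
import math


def parse_pdb_atoms(pdb_text):
    """Extract (atom name, residue name, x, y, z) from ATOM/HETATM records."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            name = line[12:16].strip()      # columns 13-16
            resname = line[17:20].strip()   # columns 18-20
            x = float(line[30:38])          # columns 31-38
            y = float(line[38:46])          # columns 39-46
            z = float(line[46:54])          # columns 47-54
            atoms.append((name, resname, x, y, z))
    return atoms


def find_clashes(receptor_atoms, ligand_atoms, cutoff=2.0):
    """Return ligand/receptor atom pairs closer than `cutoff` angstroms.

    The default cutoff is an arbitrary illustrative threshold.
    """
    clashes = []
    for lig in ligand_atoms:
        for rec in receptor_atoms:
            distance = math.dist(lig[2:], rec[2:])
            if distance < cutoff:
                clashes.append((lig[0], rec[0], rec[1], round(distance, 2)))
    return clashes
```

The brute-force double loop is quadratic in the number of atoms, which is usually acceptable for one ligand against one receptor. Any pairs it reports are candidates for manual inspection, not proof that they caused the NaN.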

kingljy0818 • Apr 16 '20 15:04

YANK often fails in the minimization step since it employs GPUs. In such cases, MD packages like Amber recommend using CPUs. So you have to minimize your structure with other MD software and then feed it to YANK. I think the authors of YANK should implement a CPU version of the minimizer and add an option to choose between the GPU and CPU minimizer.

jslim-furame • Feb 10 '21 12:02

I have the same problem; it is a native protein-ligand structure.

2021-12-03 20:55:01,679: Replica 24/25: final energy -764087.556kT
2021-12-03 20:55:06,345: Replica 25/25: initial energy -757730.894kT
2021-12-03 20:55:06,345: Using FIRE: tolerance 1.0 kJ/(nm mol) max_iterations 1000
2021-12-03 20:55:18,859: Replica 25/25: final energy -764071.022kT
2021-12-03 20:55:18,996: Single node: executing <bound method MultiStateReporter.write_sampler_states of <openmmtools.multistate.multistatereporter.MultiStateReporter object at 0x7f506ce3a7c0>>
2021-12-03 20:55:19,103: Storing sampler states took 0.106s
2021-12-03 20:55:19,104: Minimizing all replicas took 422.187s
2021-12-03 20:55:19,116: Running _compute_replica_energies serially.
2021-12-03 20:56:16,178: Computing energy matrix took 57.063s
2021-12-03 20:56:16,179: Single node: executing <bound method MultiStateReporter.write_energies of <openmmtools.multistate.multistatereporter.MultiStateReporter object at 0x7f506ce3a7c0>>
2021-12-03 20:56:16,180: ********************************************************************************
2021-12-03 20:56:16,180: Iteration 1/500
2021-12-03 20:56:16,180: ********************************************************************************
2021-12-03 20:56:16,180: Single node: executing <function ReplicaExchangeSampler._mix_replicas at 0x7f507780b8b0>
2021-12-03 20:56:16,180: Mixing replicas...
2021-12-03 20:56:16,196: Mixing of replicas took 0.016s
2021-12-03 20:56:16,196: Accepted 365402/781250 attempted swaps (46.8%)
2021-12-03 20:56:16,196: Propagating all replicas...
2021-12-03 20:56:16,196: Running _propagate_replica serially.
2021-12-03 20:57:17,404: WARNING - openmmtools.mcmc - Potential energy is NaN after 0 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart...
2021-12-03 20:57:36,557: WARNING - openmmtools.mcmc - Potential energy is NaN after 1 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart...
2021-12-03 20:59:53,568: WARNING - openmmtools.mcmc - Potential energy is NaN after 0 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart...
2021-12-03 21:02:04,707: WARNING - openmmtools.mcmc - Potential energy is NaN after 0 attempts of integration with move LangevinSplittingDynamicsMove Attempting a restart...

I also noticed something strange: the same input gives different potential energies when run with a local CPU-only YANK installation on Linux than on an HPC cluster with a GPU. Any idea why?
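To put numbers on the CPU versus GPU difference, the per-replica energies can be pulled out of both logs and subtracted. This is a minimal sketch, assuming the "Replica N/M: initial|final energy X kT" wording seen in the log above; the function names are hypothetical, and the result only quantifies the gap without explaining it.

```python
import re

# Matches e.g. "Replica 25/25: final energy -764071.022kT" as seen in the log above.
ENERGY_RE = re.compile(
    r"Replica (?P<replica>\d+)/\d+: (?P<kind>initial|final) energy "
    r"(?P<energy>-?\d+(?:\.\d+)?)\s*kT"
)


def extract_replica_energies(log_text):
    """Map (replica index, 'initial' or 'final') to the reported energy in kT."""
    energies = {}
    for match in ENERGY_RE.finditer(log_text):
        key = (int(match.group("replica")), match.group("kind"))
        energies[key] = float(match.group("energy"))
    return energies


def compare_energies(log_a, log_b):
    """Return energy differences (a - b) in kT for entries present in both logs."""
    a = extract_replica_energies(log_a)
    b = extract_replica_energies(log_b)
    return {key: a[key] - b[key] for key in sorted(a.keys() & b.keys())}
```

Energies logged as NaN are not matched by the pattern and simply drop out of the comparison. Attaching the resulting differences to a report like this one gives maintainers something concrete to look at.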

quantaosun • Dec 03 '21 13:12

Thanks for reporting this! Are you able to upload a ZIP file with all your input files and YAML file for us to try to reproduce?

It would also help if you could include your conda environment, generated with conda env export > environment.yml, so we can see if there is some dependence on a particular version of an upstream dependency.
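If exporting the full environment is not possible, a short script can at least list the versions of the key packages. The sketch below uses only the Python standard library (3.8 or later); the helper names are illustrative, and the distribution names passed in (for example "yank", "openmmtools", "openmm") are assumptions that may not match how the packages are registered in a given environment.

```python
from importlib import metadata


def collect_versions(packages):
    """Return {package name: version string, or None if no metadata is found}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def format_report(versions):
    """Render the versions as one 'name: version' line per package, sorted by name."""
    lines = []
    for name in sorted(versions):
        lines.append(f"{name}: {versions[name] or 'not installed'}")
    return "\n".join(lines)
```

For example, `print(format_report(collect_versions(["yank", "openmmtools", "openmm"])))` prints one line per package. A package reported as "not installed" may still be present if it was installed without Python metadata, so the conda export remains the more complete record.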

jchodera • Dec 06 '21 18:12