T4 lysozyme example with implicit solvent runs out of memory when lots of memory appears to be available
8 processes works OK:

While running mpiexec.hydra -np 8 yank script --yaml=p-xylene-implicit.yaml:

    bash-4.2$ free
                  total        used        free      shared  buff/cache   available
    Mem:      131934588     5161320   114950568     1014712    11822700   124444344
    Swap:             0           0           0
20 processes gives an error:

While running with mpiexec.hydra -np 20 yank script --yaml=p-xylene-implicit.yaml, just before failure:

    bash-4.2$ free
                  total        used        free      shared  buff/cache   available
    Mem:      131934588     6578724   113531156     1019564    11824708   123022088
    Swap:             0           0           0
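To put the two `free` snapshots side by side, here is a small sketch that converts them to GiB and computes how much extra memory was in use with 20 processes. The numbers are copied from the output above and `free` reports KiB by default. The helper names (`to_gib`, `summarize`) are just illustrative. Note that these are node-wide snapshots at one moment in time, so they say nothing about per-process limits.

```python
# Values copied from the two `free` outputs above (units: KiB, the default for `free`).
runs = {
    8: {"total": 131934588, "used": 5161320, "free": 114950568, "available": 124444344},
    20: {"total": 131934588, "used": 6578724, "free": 113531156, "available": 123022088},
}

KIB_PER_GIB = 1024 ** 2


def to_gib(kib):
    """Convert KiB to GiB."""
    return kib / KIB_PER_GIB


def summarize(runs):
    """Return used/available memory in GiB and the available fraction per run."""
    rows = {}
    for nproc, mem in runs.items():
        rows[nproc] = {
            "used_gib": to_gib(mem["used"]),
            "available_gib": to_gib(mem["available"]),
            "available_fraction": mem["available"] / mem["total"],
        }
    return rows


summary = summarize(runs)
# Extra node-wide memory in use when going from 8 to 20 processes.
extra_used_kib = runs[20]["used"] - runs[8]["used"]

for nproc, row in summary.items():
    print(
        f"-np {nproc}: used {row['used_gib']:.1f} GiB, "
        f"available {row['available_gib']:.1f} GiB "
        f"({row['available_fraction']:.0%} of total)"
    )
print(f"extra used with 20 vs 8 processes: {to_gib(extra_used_kib):.2f} GiB")
```

In both snapshots well over 90% of the node's memory is reported as available, which is why the "Cannot allocate memory" error is surprising.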
The first error message and surrounding text were:

    <…snip…>
    2022-05-20 13:23:05,043: WARNING - openmmtools.multistate.multistatesampler - Warning: The openmmtools.multistate API is experimental and may change in future releases
    Traceback (most recent call last):
      File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/site-packages/yank/schema/validator.py", line 411, in call_constructor
        obj = subcls(**constructor_kwargs)
      File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/site-packages/openmmtools/multistate/replicaexchange.py", line 217, in __init__
        super(ReplicaExchangeSampler, self).__init__(**kwargs)
      File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/site-packages/openmmtools/multistate/multistatesampler.py", line 203, in __init__
        self._display_cuda_devices()
      File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/site-packages/openmmtools/multistate/multistatesampler.py", line 1772, in _display_cuda_devices
        cuda_query_output = os.popen("nvidia-smi --query-gpu=index,gpu_name --format=csv,noheader").read().strip()
      File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/os.py", line 980, in popen
        bufsize=buffering)
      File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/subprocess.py", line 729, in __init__
        restore_signals, start_new_session)
      File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/lib/python3.6/subprocess.py", line 1295, in _execute_child
        restore_signals, start_new_session, preexec_fn)
    OSError: [Errno 12] Cannot allocate memory

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/usr/projects/mrmdesign/MCMD/CONDA_ENVS/yank-badger/bin/yank", line 10, in
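Since the failure happens when `os.popen` tries to spawn a child process, and not in an ordinary allocation, one thing that might be worth checking is whether a per-process limit or the kernel's commit accounting is involved. This is only a guess on my part; nothing in the logs above confirms it. The sketch below is a Linux-only diagnostic that uses the standard `resource` module and reads `/proc/meminfo` and `/proc/sys/vm/overcommit_memory`. The helper names (`parse_meminfo`, `commit_headroom_kib`, `describe_limit`) are made up for illustration.

```python
import resource


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field name: value in KiB}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def commit_headroom_kib(info):
    """Return CommitLimit - Committed_AS in KiB, or None if either is missing."""
    if "CommitLimit" in info and "Committed_AS" in info:
        return info["CommitLimit"] - info["Committed_AS"]
    return None


def describe_limit(name):
    """Format the soft/hard values of a resource limit such as RLIMIT_AS."""
    soft, hard = resource.getrlimit(getattr(resource, name))

    def fmt(value):
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return f"{name}: soft={fmt(soft)} hard={fmt(hard)}"


if __name__ == "__main__":
    # Per-process limits inherited by each MPI rank (assumption: these may differ
    # between an interactive shell and the job environment).
    for limit in ("RLIMIT_AS", "RLIMIT_NPROC"):
        if hasattr(resource, limit):
            print(describe_limit(limit))
    try:
        with open("/proc/sys/vm/overcommit_memory") as handle:
            print("vm.overcommit_memory =", handle.read().strip())
        with open("/proc/meminfo") as handle:
            info = parse_meminfo(handle.read())
        print("commit headroom (KiB):", commit_headroom_kib(info))
    except OSError as error:
        print("could not read /proc:", error)
```

It would need to be run inside the same job environment as the failing ranks, since limits can differ from an interactive shell. Also, `CommitLimit` is only enforced when `vm.overcommit_memory` is 2, so the headroom figure is informational otherwise.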
For what it's worth, I get an entirely different error with -np 25 (so perhaps I am just running things incorrectly since I count 25 lambda values for the complex system):
    <...snip...>
    Warning: importing 'simtk.openmm' is deprecated. Import 'openmm' instead.

    ===================================================================================
    =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
    =   PID 6939 RUNNING AT ba173
    =   EXIT CODE: 11
    =   CLEANING UP REMAINING PROCESSES
    =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
    YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
    This typically refers to a problem with your application.
    Please see the FAQ page for debugging suggestions