mdshare
Get access to our MD data files.
This is a downloader for molecular dynamics (MD) data from a public FTP server at FU Berlin. See here for a full list of available datasets and terms of use.
Example
This code downloads a file (unless it already exists locally) that contains a featurized set of three alanine dipeptide MD trajectories, and stores its contents, three numpy.ndarray objects (each with shape=[250000, 2] and dtype=numpy.float32), in the list trajs:
import mdshare
import numpy as np
local_filename = mdshare.fetch('alanine-dipeptide-3x250ns-backbone-dihedrals.npz')
with np.load(local_filename) as fh:
    trajs = [fh[key] for key in sorted(fh.keys())]
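The loading pattern above can be tried without network access on a synthetic stand-in file. The following is a minimal sketch under stated assumptions: the file name demo-dihedrals.npz, the helper load_trajs, and the array contents are made up for illustration, and the arrays are far shorter than the real 250000-frame trajectories.

```python
import os
import tempfile

import numpy as np


def load_trajs(npz_path):
    """Return the arrays stored in an .npz archive, ordered by key name."""
    with np.load(npz_path) as fh:
        return [fh[key] for key in sorted(fh.keys())]


# Synthetic stand-in for the downloaded file (hypothetical name and contents).
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, 'demo-dihedrals.npz')
np.savez(
    demo_path,
    **{f'arr_{i}': np.zeros((100, 2), dtype=np.float32) + i for i in range(3)})

trajs = load_trajs(demo_path)
for i, traj in enumerate(trajs):
    # Each entry is a 2D array: one row per frame, one column per feature.
    print(i, traj.shape, traj.dtype)
```

Sorting the keys keeps the order of the trajectories reproducible, independent of the order in which the archive happens to list them.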
By default, mdshare.fetch() looks for the file in the current directory and downloads it there (function parameter working_directory='.'). If you instead set this parameter to None ...
local_filename = mdshare.fetch(
    'alanine-dipeptide-3x250ns-backbone-dihedrals.npz',
    working_directory=None)
... the file will be downloaded to a temporary directory. In both cases, the function will return the path to the downloaded file.
Should the requested file already be present in the working_directory, the download is skipped.
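The behaviour described above (return a local path, use a temporary directory when working_directory is None, skip the download if the file is already there) can be sketched with the standard library alone. This is not mdshare's actual implementation: the function fetch_if_missing, its download parameter, and the fake_download helper are hypothetical names introduced only for this illustration.

```python
import os
import tempfile


def fetch_if_missing(filename, working_directory='.', download=None):
    """Hypothetical sketch: return a local path, downloading only if needed."""
    if working_directory is None:
        # No directory given: fall back to a fresh temporary directory.
        working_directory = tempfile.mkdtemp()
    local_path = os.path.join(working_directory, filename)
    if not os.path.exists(local_path):
        # Only call the (injected) downloader when the file is missing.
        download(filename, local_path)
    return local_path


calls = []


def fake_download(filename, local_path):
    """Stand-in for a real download: records the call, writes a placeholder."""
    calls.append(filename)
    with open(local_path, 'wb') as fh:
        fh.write(b'placeholder')


workdir = tempfile.mkdtemp()
first = fetch_if_missing('demo.npz', workdir, fake_download)
second = fetch_if_missing('demo.npz', workdir, fake_download)
print(first == second, len(calls))  # the second call skips the download
```

Passing the downloader as an argument keeps the sketch runnable offline; a real downloader would transfer the file from the repository instead of writing a placeholder.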
Using mdshare.catalogue() to view the files and file sizes of the available trajectories ...
mdshare.catalogue()
... produces the output:
Repository: http://ftp.imp.fu-berlin.de/pub/cmb-data/
Files:
alanine-dipeptide-0-250ns-nowater.xtc 42.9 MB
alanine-dipeptide-1-250ns-nowater.xtc 42.9 MB
alanine-dipeptide-2-250ns-nowater.xtc 42.9 MB
alanine-dipeptide-3x250ns-backbone-dihedrals.npz 6.0 MB
alanine-dipeptide-3x250ns-heavy-atom-distances.npz 135.0 MB
[...]
Containers:
mdshare-test.tar.gz 193.0 bytes
pyemma-tutorial-livecoms.tar.gz 123.9 MB
Using mdshare.search(filename_pattern) to select for a given group of files ...
pentapeptide_xtcs = mdshare.search('penta*xtc')
print(pentapeptide_xtcs)
... produces the output:
['pentapeptide-00-500ns-impl-solv.xtc',
'pentapeptide-01-500ns-impl-solv.xtc',
'pentapeptide-02-500ns-impl-solv.xtc',
...
'pentapeptide-22-500ns-impl-solv.xtc',
'pentapeptide-23-500ns-impl-solv.xtc',
'pentapeptide-24-500ns-impl-solv.xtc']
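The pattern 'penta*xtc' reads like a Unix shell-style wildcard. Assuming that is how mdshare.search() interprets it (the source does not say how matching is implemented), the selection can be reproduced offline with the standard fnmatch module. The select helper and the shortened catalogue list below are illustrative only; the file names are taken from the outputs shown above.

```python
from fnmatch import fnmatchcase

# A few file names copied from the catalogue and search outputs above.
catalogue = [
    'alanine-dipeptide-0-250ns-nowater.xtc',
    'alanine-dipeptide-3x250ns-backbone-dihedrals.npz',
    'pentapeptide-00-500ns-impl-solv.xtc',
    'pentapeptide-01-500ns-impl-solv.xtc',
]


def select(pattern, names):
    """Return the names matching a shell-style wildcard pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


matches = select('penta*xtc', catalogue)
print(matches)
```

Since the returned names are catalogue entries, each of them can presumably be passed to mdshare.fetch() in a loop to download the whole group.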