mdtraj
                                
                                 mdtraj copied to clipboard
                                
                                    mdtraj copied to clipboard
                            
                            
                            
                        NETCDF4 Error
If I have netCDF4 installed, I get the following error with mdtraj v1.10.0 when I try to load an .nc file. Probably related to the changes in #1878 . The nc file is created from a pdb file with mdtraj. This works fine in v1.9.9 (when I downgraded).
@janash
----> 1 a = mdtraj.load('ntl9.nc', top='ntl9.pdb')
File ~/miniconda3/envs/westpa-workshop2024/lib/python3.10/site-packages/mdtraj/core/trajectory.py:433, in load(filename_or_filenames, discard_overlapping_frames, **kwargs)
    423 filename_or_filenames = filename_or_filenames[1:]  # ignore first file
    424 try:
    425     # this is a little hack that makes calling load() more predictable. since
    426     # most of the loaders take a kwargs "top" except for load_hdf5, (since
   (...)
    431     # TODO make all the loaders accept a pre parsed topology (top) in order to avoid
    432     # this part and have a more consistent interface and a faster load function
--> 433     t = loader(tmp_file, **kwargs)
    435 except TypeError as e:
    436     # Don't want to intercept legit
    437     # TypeErrors
    438     if "got an unexpected keyword argument 'top'" not in str(e):
File ~/miniconda3/envs/westpa-workshop2024/lib/python3.10/site-packages/mdtraj/formats/netcdf.py:91, in load_netcdf(filename, top, stride, atom_indices, frame)
     88 topology = _parse_topology(top)
     89 atom_indices = cast_indices(atom_indices)
---> 91 with NetCDFTrajectoryFile(filename) as f:
     92     if frame is not None:
     93         f.seek(frame)
File ~/miniconda3/envs/westpa-workshop2024/lib/python3.10/site-packages/mdtraj/formats/netcdf.py:169, in NetCDFTrajectoryFile.__init__(self, filename, mode, force_overwrite)
    166 if mode == "w" and not force_overwrite and os.path.exists(filename):
    167     raise OSError('"%s" already exists' % filename)
--> 169 self._handle = netcdf(filename, mode=mode, **input_args)
    170 self._closed = False
    172 # self._frame_index is the current frame that we're at in the
    173 #     file
    174 # self._needs_initialization indicates whether we need to set the
    175 #     global properties of the file. This is required before the first
    176 #     write operation on a new file
TypeError: netcdf_file.__init__() got an unexpected keyword argument 'format'
After digging through it, it must be some shoddy merging because this line, which is no longer in the branch, resurfaced, which is causing all sorts of error.
https://github.com/mdtraj/mdtraj/blob/master/mdtraj/formats/netcdf.py#L161
Plausible, there were lots of moving parts around then - do you might removing that line and seeing if your file gets loaded as expected? (Should this have been caught by a test?)
Argh that's annoying - sounds like it's quite literally a single line fix and deleting it would fix things?
Do you have a test case/code to try and verify fail/success on? Or is
a = mdtraj.load('ntl9.nc', top='ntl9.pdb') sufficient to verify? If it's a single line fix we can just try patching it out
Plausible, there were lots of moving parts around then - do you might removing that line and seeing if your file gets loaded as expected? (Should this have been caught by a test?)
Without netCDF4 installed it wouldn't error out. I'm assuming the CI env file does not have netCDF4 installed.
Argh that's annoying - sounds like it's quite literally a single line fix and deleting it would fix things?
Do you have a test case/code to try and verify fail/success on? Or is
a = mdtraj.load('ntl9.nc', top='ntl9.pdb')sufficient to verify? If it's a single line fix we can just try patching it out
Seems like a single line (and with netcdf4 installed) is sufficient to test it. And removing that line fixes it. It's essentially prepping the arguments for netcdf4, but then that extra line tells mdtraj to go back to scipy, which lead to the error.
(Should this have been caught by a test?)
It looks like test_netcdf.py has some tests that evaluate md.load() implicitly but nothing is directly testing md.load()....Maybe that's also an addition we make in the fix? Both remove that line and add that additional md.load() test?
I'm assuming the CI env file does not have netCDF4 installed.
I think if our recommendation is to use this, it should be installed in CI
Without netCDF4 installed it wouldn't error out. I'm assuming the CI env file does not have netCDF4 installed.
We should list netCDF4 as a requirement for the environment then, no? It's a requirement clearly for this to function....
Seems like a single line (and with netcdf4 installed) is sufficient to test it. And removing that line fixes it. It's essentially prepping the arguments for netcdf4, but then that extra line tells mdtraj to go back to scipy, which lead to the error.
Well that's good to hear - at the very least it sounds like a quick fix of removing the line
Should we have installing netCDF4 in CI + adding a test for md.load() in a separate PR?
Without netCDF4 installed it wouldn't error out. I'm assuming the CI env file does not have netCDF4 installed.
We should list netCDF4 as a requirement for the environment then, no? It's a requirement clearly for this to function....
Seems like a single line (and with netcdf4 installed) is sufficient to test it. And removing that line fixes it. It's essentially prepping the arguments for netcdf4, but then that extra line tells mdtraj to go back to scipy, which lead to the error.
Well that's good to hear - at the very least it sounds like a quick fix of removing the line
Should we have installing netCDF4 in CI + adding a test for
md.load()in a separate PR?
The md.load() test already exists and I think the NetCDFTrajectoryFile object should already test everything as long as netcdf4 is installed.  Maybe a separate environment (or some monkey patching to mask netcdf4) should exist to test the scipy portion.
Maybe a separate environment (or some monkey patching to mask netcdf4) should exist to test the scipy portion.
This sounds like a dangerous idea but something worth discussing lol. Is this something folks use? I confess I barely use .nc files as is anyways.
In the meantime I've opened up a PR at #1894 to address this single-line-fix plus adding netCDF4 to the CI!
Hi all, I'm at a workshop today, so I've just seen the emails come in! I'm glad we've found a solution has been found to the problem.
Concerning testing, this may be a good time to use mocking to simulate dependency absences - https://docs.python.org/3/library/unittest.mock.html
This is fixed in 1.10.1. Closing this.