Fixes #3629

With ~~panedrlite now closer to~~ pyedr now being a thing, and the actual code itself less likely to still change, I have now finally started work on the EDR auxiliary reader that I want to implement as part of my GSoC project. Currently, this is just a skeleton, but it already allows EDR data to be read and assigned to AuxStep instances.

To Do:

[x] Finish implementation, allow similar features as XVGReader
[x] Write tests
[x] Improve documentation (currently barely there)
[x] Write self-contained example in documentation
[x] Change the add_auxiliary method to accommodate EDR data

PR Checklist

[x] Tests?
[x] Docs?
[x] CHANGELOG updated?
[x] Issue raised/referenced?

Jul 09 '22 11:07 BFedder

Hello @BFedder! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

In the file package/MDAnalysis/auxiliary/EDR.py:

Line 171:80: E501 line too long (88 > 79 characters) Line 174:80: E501 line too long (86 > 79 characters) Line 175:80: E501 line too long (91 > 79 characters)

In the file package/MDAnalysis/auxiliary/__init__.py:

Line 56:80: E501 line too long (91 > 79 characters) Line 57:80: E501 line too long (91 > 79 characters) Line 528:1: E402 module level import not at top of file

Comment last updated at 2022-09-12 10:12:49 UTC

Jul 09 '22 11:07 pep8speaks

Codecov Report

Base: 94.31% // Head: 94.33% // Increases project coverage by +0.02% :tada:

Coverage data is based on head (828b2dc) compared to base (93c0aac). Patch coverage: 99.35% of modified lines in pull request are covered.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #3749      +/-   ##
===========================================
+ Coverage    94.31%   94.33%   +0.02%     
===========================================
  Files          192      193       +1     
  Lines        24781    24975     +194     
  Branches      3341     3369      +28     
===========================================
+ Hits         23371    23561     +190     
- Misses        1362     1365       +3     
- Partials        48       49       +1

Impacted Files	Coverage Δ
package/MDAnalysis/auxiliary/EDR.py	`98.90% <98.90%> (ø)`
package/MDAnalysis/auxiliary/XVG.py	`95.61% <100.00%> (+0.07%)`	:arrow_up:
package/MDAnalysis/auxiliary/__init__.py	`100.00% <100.00%> (ø)`
package/MDAnalysis/auxiliary/base.py	`94.01% <100.00%> (+2.69%)`	:arrow_up:
package/MDAnalysis/coordinates/base.py	`94.37% <100.00%> (+0.38%)`	:arrow_up:
package/MDAnalysis/units.py	`100.00% <100.00%> (ø)`
package/MDAnalysis/lib/pkdtree.py	`91.91% <0.00%> (-2.59%)`	:arrow_down:
MDAnalysisTests/auxiliary/base.py	`89.74% <0.00%> (-1.58%)`	:arrow_down:
package/MDAnalysis/lib/mdamath.py	`100.00% <0.00%> (ø)`
MDAnalysisTests/auxiliary/__init__.py	`100.00% <0.00%> (ø)`
... and 4 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

:umbrella: View full report at Codecov.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.

Jul 20 '22 17:07 codecov[bot]

I have now adapted the tests for the trajectory-independent functionality. Next step will be the association of energy data to trajectory frames.

Jul 24 '22 20:07 BFedder

Hi @hmacdope @orbeckst @IAlibay @fiona-naughton, could I ask for your thoughts on this now? :) I believe it is ready

Aug 04 '22 16:08 BFedder

Thanks for your review IAlibay! I will upload the first bit of reworking in a bit, and think about the other points you raised over the weekend.

Aug 05 '22 15:08 BFedder

Move attaching of data from coordinates/base to auxreaders themselves. add_auxiliary shouldn’t need to know which file format is actually attached. self.add_auxiliary(AuxReader, ...) — use callback pattern in the classes.

This is now implemented, but coordinates/base will require some cleanup still.

Simplify passing of auxnames and auxterms: always require a mapping auxnames => auxterms, possibly using a dict as input (or any mapping, really).

This is partially done: auxname and auxterm have been replaced by aux_spec, which should be either a string or a dictionary that maps auxname to auxterm. For now, a string is valid here for XVGReaders so they behave as previously.

Use None as the default data selector (instead of "*")

This turns out to be a bit problematic, see the Discord GSoC channel. In a nutshell, it appears we have to either break XVGReader, be unpythonic, or use a confusing syntax were explicitly adding None causes everything to be added.

Aug 13 '22 18:08 BFedder

Ahhh, I only ran the test in MDAnalysisTests/auxiliary, not the entire testsuite before committing... And here we are. I'll fix this tomorrow

Aug 13 '22 18:08 BFedder

The API is now reworked as discussed, and most of your comments should be addressed now as well. Could I ask you to have another look please @hmacdope @orbeckst @IAlibay @fiona-naughton? :)

Aug 15 '22 17:08 BFedder

Your blog post was very useful in explaining the high level ideas. I wanted to know what happens when auxdata=None is provided? Does it ever make sense to do this, i.e., to create an empty AuxReader?

If it doesn't make sense then should we plan on changing the API for 3.0 to make auxdata the required first argument? If so, adding comments to the code and raising an issue would be good so that we don't forget. What do people think?

Aug 23 '22 20:08 orbeckst

Makes sense to me @orbeckst

Aug 23 '22 22:08 hmacdope

Thanks, @orbeckst! I don't think it would ever make sense to provide auxdata=None, I cannot think of a use case. So I think changing the order in 3.0 is a good idea.

Aug 24 '22 13:08 BFedder

@BFedder, thanks for addressing everything so quickly! Well done.

Final thing to go is just discussion around units.

Aug 26 '22 00:08 hmacdope

This new implementation now checks what units the data are in on creation of an EDRReader instance. If the units are already defined in MDAnalysis.units (pressure isn't, for example), then the data will be converted to MDA base units. This depends on a new release of pyedr, though.

Sep 01 '22 16:09 BFedder

Unit handling is now part of this PR. EDRReader holds a unit_dict that stores the units of each term as read from pyedr. During initialisation, the reader attempts to convert to MDA base units. A warning is raised if it fails to do so (as is currently the case for pressures or surface tension, for example). On conversion, the entry in unit_dict is also updated.

Sep 02 '22 13:09 BFedder

Also if you could resolve CHANGELOG stuff would be awesome.

Sep 04 '22 23:09 hmacdope

This should have addressed your reviews now, @orbeckst @hmacdope - thanks!

However, the mypy check is failing now and I don't understand why. Mamba raises the runtime error "RuntimeError: Found incorrect download: cudatoolkit. Aborting"

SHA256 mismatch seems to be the reason: SHA256 sum validation error. ERROR File not valid: SHA256 sum doesn't match expectation "/home/runner/conda_pkgs_dir/cudatoolkit-11.7.0-hd8887f6_10.tar.bz2" Expected: 31bad17839c35278203afad9c18ed15d7dd7dece11756aecf83368e1cbd9c850 Actual: e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

I'm not sure how that would have happened as a result of my changes. What can I do to fix this?

Sep 06 '22 16:09 BFedder

Thank you! That's addressed now as well

Sep 06 '22 20:09 BFedder

@BFedder there are a bunch of PEP8 issues https://github.com/MDAnalysis/mdanalysis/pull/3749#issuecomment-1179526826 — if you could address them then that will avoid that future developers (including future you) have to deal with them.

Sep 06 '22 23:09 orbeckst

The remaining PEP8 complaints are URLs, a table, and the import done to match what's already present.

Sep 07 '22 08:09 BFedder

@IAlibay you happy with moving forward with the PEP8 the way it is?

Sep 08 '22 23:09 hmacdope

@IAlibay you happy with moving forward with the PEP8 the way it is?

I'll try to give it a read through tomorrow, but technically I'm on holiday so no promises (I'm on "pressing matters or stuff that I want to play with" only atm)

Sep 09 '22 01:09 IAlibay

First review set - base maintenance stuff.

Pyedr should remain an optional dependency for now.

I see that it is being added to setup.py... honestly I don't know, here @orbeckst @hmacdope can you please run this through the business channel? I've become increasingly reluctant to add new dependencies to core (indeed my plan is to remove half the ones we already have by 3.0). I'll agree to whatever the majority agreement is.

edit: I should note, if we make it a core requirement, that means we need to ensure that we either ship it with very small test files, or we don't ship the test files at all. The ~ 20 MB tarball with the tests is not appropriate for the core library. That also needs to be done before 2.4.0 is released.

Sep 09 '22 12:09 IAlibay

Thanks for the reviews, this should address them @IAlibay @orbeckst

Sep 12 '22 10:09 BFedder

I'm not sure what to do about the Cirrus CI failing, the error message seems like it's not due to any change I made... Could I ask for advice, please?

Sep 12 '22 11:09 BFedder

Ping @tylerjereddy? RE cirrus? I'm not so sure either.

Sep 13 '22 12:09 hmacdope

I've restarted the task, as Tyler previous detailed somewhere else, the issue seems to be purely down to Cirrus getting hammered pretty heavily by a bunch of folks that suddenly realised they had free M1 runner available. Hopefully it should complete now.

Sep 13 '22 14:09 IAlibay

Looks good now, feel free to deactivate Cirrus if it becomes annoying of course.

Sep 13 '22 15:09 tylerjereddy

I have now used a defaultdict in calc_representative, moved the comments in test_edr into class docstrings, and opened issue #3830 as a place to discuss the future of calc_representative with averages.

Sep 14 '22 15:09 BFedder

@IAlibay @orbeckst are we satisfied that pyedr is now an optional dependency and that your review questions have been addressed? I had a read over and am satisfied. :)

Sep 18 '22 23:09 hmacdope

This latest batch of reviews is now also addressed, and the checks pass. I believe the EDRReader is now ready to be merged, then :partying_face: Thanks everyone for your mentorship, reviews, and help over these past few months! I have learnt a lot and am looking forward to continue contributing to MDAnalysis.

Sep 19 '22 10:09 BFedder

mdanalysis
mdanalysis copied to clipboard

AuxReader for EDR Files

PR Checklist

Comment last updated at 2022-09-12 10:12:49 UTC

Codecov Report

mdanalysis mdanalysis copied to clipboard

AuxReader for EDR Files

PR Checklist

Comment last updated at 2022-09-12 10:12:49 UTC

Codecov Report

mdanalysis
mdanalysis copied to clipboard