mdanalysis icon indicating copy to clipboard operation
mdanalysis copied to clipboard

Nuclinfo Major Pair and Minor Pair overhaul

Open ALescoulie opened this issue 2 years ago • 3 comments

Fixes #3720

Changes made in this Pull Request:

  • Add a class for Major Pair and Minor Pair distance in the new overhauled nucleic acids module.

PR Checklist

  • [x] Tests?
  • [x] Docs?
  • [x] CHANGELOG updated?
  • [x] Issue raised/referenced?

ALescoulie avatar Jun 26 '22 18:06 ALescoulie

Hello @ALescoulie! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! :beers:

Comment last updated at 2023-10-10 06:31:35 UTC

pep8speaks avatar Jun 26 '22 18:06 pep8speaks

Codecov Report

All modified lines are covered by tests :white_check_mark:

Comparison is base (427f1a7) 93.40% compared to head (ee44dca) 93.41%.

Additional details and impacted files
@@            Coverage Diff             @@
##           develop    #3735     +/-   ##
==========================================
  Coverage    93.40%   93.41%             
==========================================
  Files          170      184     +14     
  Lines        22257    23394   +1137     
  Branches      4071     4075      +4     
==========================================
+ Hits         20790    21854   +1064     
- Misses         951     1024     +73     
  Partials       516      516             
Files Coverage Δ
package/MDAnalysis/analysis/nucleicacids.py 100.00% <100.00%> (ø)

... and 14 files with indirect coverage changes

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

codecov[bot] avatar Jun 26 '22 18:06 codecov[bot]

Before going much further here we should discuss the format of Results — please see #3744.

orbeckst avatar Jun 30 '22 23:06 orbeckst

@ALescoulie It would be great to not let this work get forgotten, are you still looking to contribute this?

IAlibay avatar Jul 31 '23 11:07 IAlibay

Yeah, I'll continue working on this. Sorry for abandoning it, I was busy with school and dealing with mental health issues. I'm getting the development repo set up on my desktop now.

ALescoulie avatar Aug 18 '23 22:08 ALescoulie

Hi @ALescoulie good to hear from you! Would be fantastic to get your work in!

orbeckst avatar Aug 18 '23 22:08 orbeckst

Linter Bot Results:

Hi @ALescoulie! Thanks for making this PR. We linted your code and found the following:

Some issues were found with the formatting of your code.

Code Location Outcome
main package ⚠️ Possible failure
testsuite ⚠️ Possible failure

Please have a look at the darker-main-code and darker-test-code steps here for more details: https://github.com/MDAnalysis/mdanalysis/actions/runs/6465590293/job/17552033113


Please note: The black linter is purely informational, you can safely ignore these outcomes if there are no flake8 failures!

github-actions[bot] avatar Aug 19 '23 07:08 github-actions[bot]

I pushed a larger change to the PR. It was a bit hard picking up where I left off, but I was able to jump back into developing without too many issues.

I refactored my code so that all of the Pair distance analysis classes use a NucPairDist static method select_strand_atoms which takes in the strands, selected atoms names, and the nucleic acid names. It then returns a tuple containing the the lists of atoms for each strand. It also has checks ensuring that the strands only contain Nucleic Acids and that the selections don't return empty AtomGroups. I also added tests for both errors.

TODO

  • [x] Reformat NucPairDist to use the newer results format
  • [x] Fix misc docs issues
  • [x] Update the change log

ALescoulie avatar Aug 21 '23 06:08 ALescoulie

I like the compact code and moving the strand selection into a separate method.

  1. The primary issue is just that renaming kwargs is an API break and because the code has been in core for a while (and not marked as "Experimental" or anything like that) we'd need to properly deprecate any API changes. The alternative is using the current (old) names also for the newly added classes.
  2. In terms of API coherency, I think it were better if actual MDA objects, namely ResidueGroup, instead of list[Residue] were accepted. This would require a deprecation of the current list of residues input although that would be worth it in my opinion, so that we have a seamless OO API.
  3. There are also some documentation issues but these are easy to fix.

@orbeckst I think I have a good way to switch to using ResidueGroup from List[Residue] without API breaks. I'll make NucPairDist andWCDist accept both, but raise a DepreciatedWarning if you give it List[Residue] then just convert it internally. Than raise an issue tagged with 3,0 to remove support for residue lists when we do API changes for the major version update. What do you think of that solution?

ALescoulie avatar Aug 29 '23 21:08 ALescoulie

good idea!

orbeckst avatar Aug 30 '23 00:08 orbeckst

@orbeckst I'll push that change today, was waiting for feedback on the idea before implementing it

ALescoulie avatar Aug 30 '23 19:08 ALescoulie

I think this one is pretty close to done, There might still be some misc typos in the docs, but aside from that I refactored the code based on the suggestions of @orbeckst to use ResidueGroup and pulled selections into their own static methods. I also have gotten full code coverage, and updated CHANGELLOG.

ALescoulie avatar Aug 31 '23 00:08 ALescoulie

@ALescoulie - could you please confirm that you are ok releasing this code under a LGPLv2.1+ license and that it adheres to the developer certificate of origin?

Aside from this I will dismiss my review block and leave this to @orbeckst to review & merge.

IAlibay avatar Sep 02 '23 11:09 IAlibay

@ALescoulie - could you please confirm that you are ok releasing this code under a LGPLv2.1+ license and that it adheres to the developer certificate of origin?

Aside from this I will dismiss my review block and leave this to @orbeckst to review & merge.

I am happy to release this under the LGPL license. I wrote all of this on my own based on the older nucinfo code, and the MDAnalysis framework. I hope that it helps people with their research.

ALescoulie avatar Sep 02 '23 12:09 ALescoulie

I just removed the type alias import, apparently that wasn't in Python 3.9, @orbeckst is there anything else to fix on this PR

ALescoulie avatar Sep 04 '23 21:09 ALescoulie

@orbeckst I believe I addressed your feedback on the documentation, and added some further clarity. I also rewrote how WatsonCrickDist handles List[Resiude] in order to catch possible errors when both a List[Residue] and a ResidueGroup are passed in and in cases where the list is not entirely Residues. I think I should still add one additional test to get back to 100% coverage as there is one if statement I don't hit, I also will go read the documentation one more time.

ALescoulie avatar Sep 12 '23 00:09 ALescoulie

@orbeckst I think this PR about ready, assuming I didn't miss any typos in the docs. I also pulled most of the atom selections in the tests into a fixture so there is less repetition when running the CI.

ALescoulie avatar Sep 27 '23 18:09 ALescoulie

@orbeckst, I'll get it done by tomorrow than I can start on the next part of #3720, the pep8 stuff will just take a few minutes.

ALescoulie avatar Oct 10 '23 03:10 ALescoulie

@orbeckst I got the pep8 stuff done

ALescoulie avatar Oct 10 '23 06:10 ALescoulie

Hooray. I reopen #3720 for the other things.

orbeckst avatar Oct 10 '23 13:10 orbeckst