mdanalysis
RDKit converter inferring
Fixes part of #3996
Changes made in this Pull Request:
- Refactored the RDKit converter code to move the inferring code into a separate `RDKitInferring` module. The bond order and charge inferer has been moved to an `MDAnalysisInferer` dataclass there.
- Renamed the `NoImplicit` parameter to `implicit_hydrogens` and added a separate `inferer` argument (defaults to `MDAnalysisInferer()`). Passing `NoImplicit` to any of the relevant functions will issue a warning and make the necessary arrangements to execute the code in a backwards-compatible way (i.e. `implicit_hydrogens=not NoImplicit` and `if NoImplicit is False: inferer=None`).
- Added `TemplateInferer`, which wraps RDKit's `AssignBondOrdersFromTemplate`. There's an additional `adjust_hydrogens` parameter that, when set to `True`, allows one to assign bond orders from a template molecule with implicit hydrogens to an input molecule with explicit hydrogens (which won't work with the base `AssignBondOrdersFromTemplate` for charged molecules where the charged atom has a hydrogen). I originally had this code in ProLIF for dealing with PDBQT inputs and figured it would be worth having here as well.
- Added a wrapper around RDKit's `rdDetermineBonds` inferring, as showcased here.
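For reviewers, the backwards-compatible handling of `NoImplicit` described above can be sketched roughly as follows. This is a minimal illustration rather than the code in this PR: `resolve_legacy_kwargs`, `PlaceholderInferer` and the `_UNSET` sentinel are hypothetical names, the `implicit_hydrogens=False` default is assumed from the old `NoImplicit=True` default, and the warning category is assumed to be `DeprecationWarning`.

```python
import warnings
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple

# Sentinel so that "NoImplicit was not passed" differs from NoImplicit=False.
_UNSET = object()


@dataclass(frozen=True)
class PlaceholderInferer:
    """Hypothetical stand-in for MDAnalysisInferer (not the real class)."""

    def __call__(self, mol: Any) -> Any:
        # A real inferer would assign bond orders and formal charges here.
        return mol


def resolve_legacy_kwargs(
    implicit_hydrogens: bool = False,  # assumed default (old NoImplicit=True)
    inferer: Optional[Callable[[Any], Any]] = PlaceholderInferer(),
    NoImplicit: Any = _UNSET,
) -> Tuple[bool, Optional[Callable[[Any], Any]]]:
    """Map the deprecated NoImplicit flag onto the two new arguments."""
    if NoImplicit is not _UNSET:
        warnings.warn(
            "NoImplicit is deprecated; use implicit_hydrogens and inferer.",
            DeprecationWarning,  # assumed category
            stacklevel=2,
        )
        # Rule from the PR description: implicit_hydrogens = not NoImplicit
        implicit_hydrogens = not NoImplicit
        # Rule from the PR description: NoImplicit=False disables inferring
        if NoImplicit is False:
            inferer = None
    return implicit_hydrogens, inferer
```

With this mapping, a legacy call passing `NoImplicit=False` resolves to `implicit_hydrogens=True` with no inferer, while `NoImplicit=True` keeps the default inferer and explicit hydrogens.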
PR Checklist
- [x] Tests?
- [x] Docs?
- [x] CHANGELOG updated?
- [x] Issue raised/referenced?
Developers certificate of origin
- [x] I certify that this contribution is covered by the LGPLv2.1+ license as defined in our LICENSE and adheres to the Developer Certificate of Origin.
:books: Documentation preview :books:: https://mdanalysis--4305.org.readthedocs.build/en/4305/
Hello @cbouy! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
- In the file `package/MDAnalysis/converters/RDKit.py`:
  - Line 67:80: E501 line too long (82 > 79 characters)
  - Line 445:80: E501 line too long (80 > 79 characters)
- In the file `package/MDAnalysis/converters/RDKitInferring.py`:
  - Line 331:80: E501 line too long (104 > 79 characters)
  - Line 333:80: E501 line too long (104 > 79 characters)
  - Line 335:80: E501 line too long (104 > 79 characters)
  - Line 337:80: E501 line too long (104 > 79 characters)
  - Line 339:80: E501 line too long (104 > 79 characters)
  - Line 341:80: E501 line too long (104 > 79 characters)
  - Line 343:80: E501 line too long (104 > 79 characters)
  - Line 345:80: E501 line too long (104 > 79 characters)
  - Line 347:80: E501 line too long (104 > 79 characters)
  - Line 349:80: E501 line too long (104 > 79 characters)
  - Line 351:80: E501 line too long (104 > 79 characters)
  - Line 353:80: E501 line too long (104 > 79 characters)
- In the file `testsuite/MDAnalysisTests/converters/test_rdkit.py`:
  - Line 476:80: E501 line too long (80 > 79 characters)
  - Line 904:80: E501 line too long (85 > 79 characters)
  - Line 905:80: E501 line too long (84 > 79 characters)
  - Line 906:80: E501 line too long (80 > 79 characters)
  - Line 907:80: E501 line too long (88 > 79 characters)
Comment last updated at 2024-08-26 15:54:51 UTC
Linter Bot Results:
Hi @cbouy! Thanks for making this PR. We linted your code and found the following:
Some issues were found with the formatting of your code.
| Code Location | Outcome |
|---|---|
| main package | ⚠️ Possible failure |
| testsuite | ⚠️ Possible failure |
Please have a look at the darker-main-code and darker-test-code steps here for more details: https://github.com/MDAnalysis/mdanalysis/actions/runs/10563005550/job/29262240571
Please note: The black linter is purely informational, you can safely ignore these outcomes if there are no flake8 failures!
Codecov Report
Attention: Patch coverage is 98.49057% with 4 lines in your changes missing coverage. Please review.
Project coverage is 93.61%. Comparing base (d73995a) to head (31964b4).
| Files | Patch % | Lines |
|---|---|---|
| package/MDAnalysis/converters/RDKit.py | 95.91% | 0 Missing and 2 partials :warning: |
| package/MDAnalysis/converters/RDKitInferring.py | 99.06% | 0 Missing and 2 partials :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## develop #4305 +/- ##
===========================================
- Coverage 93.62% 93.61% -0.01%
===========================================
Files 173 186 +13
Lines 21419 22575 +1156
Branches 3978 4004 +26
===========================================
+ Hits 20053 21134 +1081
- Misses 903 976 +73
- Partials 463 465 +2
Thanks @cbouy ! From a quick look this seems great, I'll try to review it at some point over the next week (unless someone gets to it first).
P.S. For others who might review here: codecov seems to be throwing a bunch of "uncovered code" messages for lines that do appear to be covered. Cycling the PR might clear them, but I don't think it's a major necessity right now.
@cbouy are you still working on the PR or is this ready for review?
Should be ready for review, I'll just need to update the changelog when ready for merging
That's great.
Can you please add the CHANGELOG update right away, even if it will require resolving a merge conflict later? The summary there tends to be really helpful for assessing a PR.... and typically no reviewer will green-light such a PR without the CHANGELOG in place anyway.
@richardjgowers do you have capacity to shepherd the PR to completion? If not please let me know and un-assign yourself. Thanks!
Sorry for the spam, should be good now!
@richardjgowers are you able to review this PR yourself or is there someone you could ping? From my very cursory glance, this looks pretty much ready and would be good to get in, given our roadmap towards "better chemistry".
Not sure why one of the Azure tests is timing out, or why the bot removed some of the tags, but this is re-ready for review 😅
/azp run
Azure Pipelines successfully started running 1 pipeline(s).