Unexpected apparition of RAD

Open mpagni12 opened this issue 3 years ago • 3 comments

Describe the bug

A RAD line may appear ex nihilo in the output MOL block when a heavy atom is present.

To Reproduce

from rdkit import Chem

# The first mol block is KEGG C06696 for metallic lead
mol_block_1 = """


  1  0  0  0  0  0  0  0  0  0999 V2000
   37.3800  -22.7500    0.0000 Pb  0  0  0  0  0  0  0  0  0  0  0  0
M  END
"""
mol_block_2 = """
     RDKit          2D

  1  0  0  0  0  0  0  0  0  0999 V2000
   37.3800  -22.7500    0.0000 Pb  0  0  0  0  0  2  0  0  0  0  0  0
M  END
""" 
mol_block_3 = """
     RDKit          2D

  1  0  0  0  0  0  0  0  0  0999 V2000
   37.3800  -22.7500    0.0000 Pb  0  0  0  0  0  2  0  0  0  0  0  0
M  RAD  1   1   3
M  END
"""
print( Chem.MolToMolBlock( Chem.MolFromMolBlock( mol_block_1 )) == mol_block_2 )
print( Chem.MolToMolBlock( Chem.MolFromMolBlock( mol_block_2 )) == mol_block_3 )

yields

True
True

Expected behavior

True
False
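The two comparisons above only report True or False. To see which lines actually differ between two MOL blocks, a line-level diff can help. This is a minimal sketch using only the standard library; the helper name `molblock_diff` is hypothetical and not part of RDKit.

```python
import difflib

def molblock_diff(block_a, block_b):
    """Return only the lines that were added ('+ ') or removed ('- ')."""
    a_lines = block_a.splitlines()
    b_lines = block_b.splitlines()
    return [
        line
        for line in difflib.ndiff(a_lines, b_lines)
        if line.startswith(("+ ", "- "))
    ]

# The two blocks from the report: identical except for the radical line.
block_without_rad = "\n".join([
    "",
    "     RDKit          2D",
    "",
    "  1  0  0  0  0  0  0  0  0  0999 V2000",
    "   37.3800  -22.7500    0.0000 Pb  0  0  0  0  0  2  0  0  0  0  0  0",
    "M  END",
])
block_with_rad = block_without_rad.replace("M  END", "M  RAD  1   1   3\nM  END")

for line in molblock_diff(block_without_rad, block_with_rad):
    print(line)
```

Applied to the input and output of a round trip, this shows at a glance whether the only change is an added `M  RAD` line.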

Configuration

  • RDKit version: 2022.9.1
  • OS: macOS Monterey
  • Python version (if relevant): Python 3.10.8
  • Are you using conda? no
  • If you are not using conda: how did you install the RDKit? pip3 install rdkit

Additional context

There are many other entries in KEGG with a heavy atom that display the same problem. The one I am reporting here is pretty "compact".

Thank you for maintaining RDKit,

Marco

mpagni12, Nov 18 '22 17:11

mol_block_2 and mol_block_3 exercise the atom valence (vvv) field of the atom line. Any value other than 0 sets the valence explicitly (1 to 14 give that valence, and 15 means zero valence) and turns off implicit hydrogen detection, which is what ordinarily determines the valence.
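To make the vvv field concrete, here is a minimal sketch that reads it from a V2000 atom line by column position. The column offsets follow the fixed-width V2000 atom-line layout as I understand it from the CTfile format; the function names are hypothetical, and this is not how RDKit itself parses the line.

```python
def _int_field(text):
    """Blank or missing fixed-width fields are treated as 0."""
    text = text.strip()
    return int(text) if text else 0

def parse_v2000_atom_line(line):
    """Extract a few fixed-width fields from a V2000 atom line."""
    return {
        "x": float(line[0:10]),
        "y": float(line[10:20]),
        "z": float(line[20:30]),
        "symbol": line[31:34].strip(),
        "valence_code": _int_field(line[48:51]),  # the "vvv" field
    }

def describe_valence_code(vvv):
    """Interpret vvv: 0 = no marking, 1..14 = that valence, 15 = zero valence."""
    if vvv == 0:
        return "no marking"
    if vvv == 15:
        return "zero valence"
    return f"explicit valence {vvv}"

# Atom lines from mol_block_1 and mol_block_2 in the report.
line_1 = "   37.3800  -22.7500    0.0000 Pb  0  0  0  0  0  0  0  0  0  0  0  0"
line_2 = "   37.3800  -22.7500    0.0000 Pb  0  0  0  0  0  2  0  0  0  0  0  0"

for line in (line_1, line_2):
    atom = parse_v2000_atom_line(line)
    print(atom["symbol"], describe_valence_code(atom["valence_code"]))
```

The only difference between the two atom lines is the sixth integer column after the element symbol, which is the field being discussed here.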

Under the pre-2014 regime, Pb was considered one of those atoms whose hydrogen count could be computed ("implicit"). Afterwards, this was disabled, and it was no longer possible to compute implicit H on Pb and other elements ("MDL Valence-Mageddon").

Under pre-2014:

  • mol_block_1 encodes PbH2 (default valence 2, w/ implicit H detection).
  • mol_block_2 encodes PbH2 (explicit valence 2)
  • mol_block_3 encodes PbH2 (explicit valence 2 w/ 2e radical)

Under post-2014:

  • mol_block_1 encodes Pb (no implicit H)
  • mol_block_2 encodes PbH2 (valence 2 and so 2 implicit H)
  • mol_block_3 encodes PbH2 (valence 2, 2 implicit H, and 2e radical)

It's possible that these results reflect a desire to eliminate valence ambiguity by setting the valence attribute explicitly on output. RDKit may then notice, when reading the block back, that the valence has been set, and so set the radical attribute.
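For reference, the radical attribute shows up in the output as an `M  RAD` property line. The sketch below decodes such a line into atom-index/radical-code pairs. It assumes whitespace-separated fields, as in the line from mol_block_3, and uses the CTfile radical codes as I understand them (0 = no radical, 1 = singlet, 2 = doublet, 3 = triplet); the function name is hypothetical.

```python
RADICAL_NAMES = {0: "no radical", 1: "singlet", 2: "doublet", 3: "triplet"}

def parse_rad_line(line):
    """Return {atom_index: radical_code} for one 'M  RAD' property line."""
    tokens = line.split()
    if tokens[:2] != ["M", "RAD"]:
        raise ValueError(f"not an M  RAD line: {line!r}")
    count = int(tokens[2])                  # number of (atom, code) pairs
    values = [int(tok) for tok in tokens[3:]]
    if len(values) != 2 * count:
        raise ValueError(f"expected {count} pairs in: {line!r}")
    # Atom indices are 1-based, as in the MOL block itself.
    return {values[i]: values[i + 1] for i in range(0, len(values), 2)}

radicals = parse_rad_line("M  RAD  1   1   3")
for atom_index, code in radicals.items():
    print(f"atom {atom_index}: {RADICAL_NAMES[code]}")
```

For the line in mol_block_3, this reports a triplet on atom 1, which matches the "2e radical" reading used in the lists above.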

rapodaca, Feb 03 '23 02:02

I believe that this is a bug (potentially more than one bug), but this stuff is a pain, so I'm going to have to spend some time digging to be sure.

greglandrum, Feb 03 '23 13:02

This issue was marked as stale because it has been open for 90 days with no activity.

github-actions[bot], Oct 20 '24 02:10

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions[bot], Nov 05 '24 02:11