MolecularGraph.jl

InChI layers missing from Mol object

Open timoleistner opened this issue 5 months ago • 1 comments

When converting an InChI string to a mol object and then converting the mol object back to InChI, the trailing stereochemistry layers (/t, /m, /s) are missing from the result.

ascorbicacid_inchi = "InChI=1/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/h2,5,7-10H,1H2/t2-,5+/m0/s1"
inchi(inchitomol(ascorbicacid_inchi))

InChI=1S/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/h2,5,7-10H,1H2

Passing the corresponding molblock directly to inchi() works correctly, which leads me to believe that inchi() itself is fine but can't extract this layer information from the molecular graph object.

inchi("54670067
  -OEChem-02072408212D

 20 20  0     1  0  0  0  0  0999 V2000
    5.0298   -0.5357    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    3.3548    1.5521    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    2.4608   -0.2266    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    5.0868    2.5521    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    3.1330   -2.2957    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    5.3086   -2.2957    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0
    4.2208    0.0521    0.0000 C   0  0  2  0  0  0  0  0  0  0  0  0
    4.2208    1.0521    0.0000 C   0  0  2  0  0  0  0  0  0  0  0  0
    3.4118   -0.5357    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    5.0868    1.5521    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.7208   -1.4867    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.7208   -1.4867    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    4.9782    0.4380    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    4.2208    1.6721    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    5.2989    0.9695    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    5.6974    1.6598    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    3.3548    2.1721    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.0000   -0.6415    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    5.6238    2.8621    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
    3.3852   -2.8621    0.0000 H   0  0  0  0  0  0  0  0  0  0  0  0
  1  7  1  0  0  0  0
  1 12  1  0  0  0  0
  8  2  1  1  0  0  0
  2 17  1  0  0  0  0
  3  9  1  0  0  0  0
  3 18  1  0  0  0  0
  4 10  1  0  0  0  0
  4 19  1  0  0  0  0
  5 11  1  0  0  0  0
  5 20  1  0  0  0  0
  6 12  2  0  0  0  0
  7  8  1  0  0  0  0
  7  9  1  0  0  0  0
  7 13  1  6  0  0  0
  8 10  1  0  0  0  0
  8 14  1  0  0  0  0
  9 11  2  0  0  0  0
 10 15  1  0  0  0  0
 10 16  1  0  0  0  0
 11 12  1  0  0  0  0
M  END
")

InChI=1S/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/h2,5,7-10H,1H2/t2-,5+/m0/s1
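To make the difference between the two outputs explicit, here is a minimal sketch that compares them layer by layer. The helpers inchilayers and missinglayers are hypothetical names made up for this illustration; they are not part of MolecularGraph.jl and only use plain string handling from Base Julia. The sketch assumes that layers are separated by "/" and that the first field is the version prefix.

```julia
# Hypothetical helpers, not part of MolecularGraph.jl.
# An InChI string is a "/"-separated list of layers; the first field
# ("InChI=1" or "InChI=1S") is the version prefix and is skipped here.
inchilayers(s::AbstractString) = split(s, '/')[2:end]

# Layers that appear in `reference` but not in `candidate`, in order.
missinglayers(reference, candidate) =
    setdiff(inchilayers(reference), inchilayers(candidate))

# The two outputs reported above.
from_molblock = "InChI=1S/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/h2,5,7-10H,1H2/t2-,5+/m0/s1"
from_mol      = "InChI=1S/C6H8O6/c7-1-2(8)5-3(9)4(10)6(11)12-5/h2,5,7-10H,1H2"

# Print each layer that was lost on the Mol round trip.
for layer in missinglayers(from_molblock, from_mol)
    println("missing layer: /", layer)
end
```

For the two strings above, the layers reported as missing are t2-,5+, m0 and s1, while the formula, connectivity and hydrogen layers are identical.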

This can be dangerous when structures are only available as SDF files, because sdfilereader() only yields molecule objects. Calling inchi.(sdfilereader(file)) then produces incorrect/incomplete InChIs.

timoleistner, Feb 07 '24 13:02