xyz2mol

Convert xyz file with several structures in it (different molecules or same molecule, conformers)

Open stnrl opened this issue 2 years ago • 2 comments

Is it possible to convert several molecules in xyz format that are in the same file to an sdf output? At the moment, it works for me, but only with a single molecule per file.

stnrl, Oct 17 '22 13:10

No, that's not currently possible

jhjensen2, Oct 17 '22 13:10
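Since the conversion reportedly works with a single molecule per file, one possible workaround is to split a multi-structure .xyz file into one file per structure and convert these one at a time. The following is only a minimal sketch, assuming the usual concatenated layout (atom count, comment line, then one line per atom, repeated for each structure); the function names `split_xyz` and `write_frames` and the output file names are hypothetical and not part of xyz2mol.

```python
from pathlib import Path


def split_xyz(text):
    """Split concatenated XYZ text into a list of single-structure blocks."""
    lines = text.splitlines()
    frames = []
    i = 0
    while i < len(lines):
        # Skip blank lines between structures.
        if not lines[i].strip():
            i += 1
            continue
        n_atoms = int(lines[i].strip())       # first line: number of atoms
        block = lines[i:i + n_atoms + 2]      # count + comment + atom lines
        if len(block) < n_atoms + 2:
            raise ValueError("truncated structure in XYZ input")
        frames.append("\n".join(block) + "\n")
        i += n_atoms + 2
    return frames


def write_frames(xyz_path, prefix="frame"):
    """Write each structure to its own file (hypothetical naming scheme)."""
    frames = split_xyz(Path(xyz_path).read_text())
    paths = []
    for k, frame in enumerate(frames):
        out = Path(f"{prefix}_{k:03d}.xyz")
        out.write_text(frame)
        paths.append(out)
    return paths
```

Each resulting file could then be passed to the script individually, e.g. `python xyz2mol.py frame_000.xyz -o sdf`, in the same way as for a single-molecule file.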

@stnrl It seems to depend on the input.

  • An example where this works: generating a model that includes pyridine and diethyl ether in one data set with OpenBabel (3.1.1 -- Sep 8 2022 -- 12:29:09, Linux Debian 12/bookworm) by
obabel -:"c1ccncc1.CCOCC" -h --gen3d -O py_Et2O.xyz

yields a test data set similar to

26

C          0.66023        0.62902       -1.81071
C         -0.16353        1.61570       -1.27828
C         -1.41741        1.24497       -0.81939
N         -1.88558       -0.02202       -0.85568
C         -1.06234       -0.95946       -1.37401
C          0.20711       -0.68586       -1.85859
C          1.14077       -0.37698        1.79810
C          2.46962       -0.62285        1.11274
O          3.48114       -0.52854        2.10885
C          4.77258       -0.85279        1.60438
C          5.74302       -0.75889        2.76427
H          1.65370        0.88115       -2.17036
H          0.16494        2.64694       -1.21534
H         -2.09635        1.97721       -0.39187
H         -1.45720       -1.97127       -1.37534
H          0.83206       -1.48070       -2.24925
H          0.30852       -0.48442        1.10511
H          1.11444        0.62717        2.23061
H          1.00003       -1.08648        2.61918
H          2.64920        0.13030        0.33917
H          2.47898       -1.61692        0.65292
H          5.05736       -0.15356        0.81065
H          4.77092       -1.87231        1.20165
H          5.80440        0.27335        3.12474
H          6.74216       -1.09244        2.47453
H          5.39381       -1.36611        3.60588

The subsequent call

python xyz2mol.py py_Et2O.xyz -o sdf > recovery.sdf

yields


     RDKit          3D

 26 25  0  0  0  0  0  0  0  0999 V2000
    0.6602    0.6290   -1.8107 C   0  0  0  0  0  0  0  0  0  0  0  0
   -0.1635    1.6157   -1.2783 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.4174    1.2450   -0.8194 C   0  0  0  0  0  0  0  0  0  0  0  0
   -1.8856   -0.0220   -0.8557 N   0  0  0  0  0  0  0  0  0  0  0  0
   -1.0623   -0.9595   -1.3740 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.2071   -0.6859   -1.8586 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.1408   -0.3770    1.7981 C   0  0  0  0  0  0  0  0  0  0  0  0
    2.4696   -0.6229    1.1127 C   0  0  0  0  0  0  0  0  0  0  0  0
    3.4811   -0.5285    2.1088 O   0  0  0  0  0  0  0  0  0  0  0  0
    4.7726   -0.8528    1.6044 C   0  0  0  0  0  0  0  0  0  0  0  0
    5.7430   -0.7589    2.7643 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.6537    0.8811   -2.1704 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.1649    2.6469   -1.2153 H   0  0  0  0  0  0  0  0  0  0  0  0
   -2.0964    1.9772   -0.3919 H   0  0  0  0  0  0  0  0  0  0  0  0
   -1.4572   -1.9713   -1.3753 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.8321   -1.4807   -2.2492 H   0  0  0  0  0  0  0  0  0  0  0  0
    0.3085   -0.4844    1.1051 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.1144    0.6272    2.2306 H   0  0  0  0  0  0  0  0  0  0  0  0
    1.0000   -1.0865    2.6192 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.6492    0.1303    0.3392 H   0  0  0  0  0  0  0  0  0  0  0  0
    2.4790   -1.6169    0.6529 H   0  0  0  0  0  0  0  0  0  0  0  0
    5.0574   -0.1536    0.8106 H   0  0  0  0  0  0  0  0  0  0  0  0
    4.7709   -1.8723    1.2017 H   0  0  0  0  0  0  0  0  0  0  0  0
    5.8044    0.2733    3.1247 H   0  0  0  0  0  0  0  0  0  0  0  0
    6.7422   -1.0924    2.4745 H   0  0  0  0  0  0  0  0  0  0  0  0
    5.3938   -1.3661    3.6059 H   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  2  0
  1  6  1  0
  1 12  1  0
  2  3  1  0
  2 13  1  0
  3  4  2  0
  3 14  1  0
  4  5  1  0
  5  6  2  0
  5 15  1  0
  6 16  1  0
  7  8  1  0
  7 17  1  0
  7 18  1  0
  7 19  1  0
  8  9  1  0
  8 20  1  0
  8 21  1  0
  9 10  1  0
 10 11  1  0
 10 22  1  0
 10 23  1  0
 11 24  1  0
 11 25  1  0
 11 26  1  0
M  END

which one may read back into obabel for a visualization, e.g.

obabel recovery.sdf -O recovery.svg

[image: positive]

or Jmol

[image: positive_2]

  • And an example where this does not work:
obabel -:"c1ccncc1.c1ccccc1" -h --gen3d -O py_benzene.xyz
python xyz2mol.py py_benzene.xyz -o sdf > recovery2.sdf

because OpenBabel's intermediate .xyz file describes a geometry of the two molecules as if they were intercalated:

23

C          1.09472        0.18388        0.62199
C          0.22445       -0.78231       -0.93482
C         -1.53084       -0.58986       -1.16195
C         -2.38027        0.44777       -0.05584
C         -1.61232        1.31553        1.26254
C          0.07673        1.21771        1.63532
C          2.46037       -0.39474       -1.07462
C          1.41813        0.99491       -1.36746
C         -0.13693        1.02962       -0.55139
N         -0.60992       -0.44977        0.64771
C          0.51589       -1.80602        0.84480
C          2.03336       -1.76142       -0.01435
H          2.12876        0.18703        0.97545
H          0.64176       -1.48502       -1.67530
H         -2.06532       -1.17270       -1.90369
H         -3.45972        0.54416       -0.15900
H         -2.22595        1.99641        1.84888
H          0.52390        1.81605        2.41714
H          3.46081       -0.37413       -1.50611
H          1.75944        1.86006       -1.92070
H         -0.79524        1.88733       -0.69495
H          0.13597       -2.61848        1.45282
H          2.73482       -2.58428        0.08258

If you read the latter .xyz file into, e.g., Jmol (with the optional Edit -> Preferences -> Bonds -> Compute Bonds Automatically enabled; if this is not your default, you additionally have to confirm it with Apply), you see there is a good reason why xyz2mol does not provide a new record:

[image: negative]


Because the example of pyridine and diethyl ether worked reasonably well, perhaps there is a way to introduce some «perturbation» into OpenBabel's action so that it reports benzene and pyridine as two molecules farther apart from each other than here. However, I do not recall whether OpenBabel's documentation describes how to leave this local energy minimum when running the --gen3d optimization.

nbehrnd, Oct 17 '22 18:10
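To check an .xyz file for this kind of entangled geometry before handing it to xyz2mol, one could count the distance-based fragments it contains, similar in spirit to computing bonds automatically in a viewer. The sketch below is an assumption-laden illustration and not how xyz2mol itself perceives bonds: two atoms are treated as bonded when their distance is below 1.3 times the sum of their covalent radii (the factor 1.3 and the small radii table are assumptions), and the names `parse_xyz` and `count_fragments` are hypothetical.

```python
import math

# Approximate covalent radii in angstrom (assumed values; extend as needed).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}


def parse_xyz(text):
    """Read a single-structure XYZ string into (symbol, (x, y, z)) tuples."""
    lines = text.strip().splitlines()
    n_atoms = int(lines[0])
    atoms = []
    for line in lines[2:2 + n_atoms]:         # skip count and comment lines
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, (float(x), float(y), float(z))))
    return atoms


def count_fragments(atoms, tolerance=1.3):
    """Count connected components under a simple distance-based bond rule."""
    parent = list(range(len(atoms)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (sym_i, pos_i) in enumerate(atoms):
        for j in range(i + 1, len(atoms)):
            sym_j, pos_j = atoms[j]
            cutoff = tolerance * (COVALENT_RADII[sym_i] + COVALENT_RADII[sym_j])
            if math.dist(pos_i, pos_j) < cutoff:
                parent[find(i)] = find(j)     # merge the two fragments

    return len({find(i) for i in range(len(atoms))})
```

For a file meant to hold two separate molecules, a result of 2 would be the expected outcome; a smaller number hints that the molecules overlap in the generated geometry and that a conversion is unlikely to give a sensible record.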