Load bonds from .cif to make StructureGraph

Open alex-l-m opened this issue 2 years ago • 5 comments

There is a StructureGraph class, and some .cif files include bonding information. Is there any interest in creating StructureGraph objects from the bonding information in a .cif file?

alex-l-m avatar Nov 04 '21 02:11 alex-l-m

That would be nice. Can you point to the CIF specification? Last time I looked into it, it was unclear to me how CIF defined bonds that crossed periodic boundaries.

If we do add this, I think StructureGraph -> CIF export would also be useful.

mkhorton avatar Nov 08 '21 23:11 mkhorton

The bond information in CIF files is defined by the GEOM_BOND category. It's kind of annoying to set up due to the non-intuitive periodicity flags, but it can be done. Bonding across boundaries is specified using a flag of the format 1_XYZ. If I recall correctly (it's been a long time...), it's such that any number after the underscore other than 5 indicates a cross over a periodic boundary. For example, I think 1_545 specifies that it crosses the boundary in the -y dimension, whereas 1_556 specifies that it crosses the boundary in the +z dimension. Or something very similar to this.
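
For reference, the IUCr core dictionary writes this code as n_klm, where n is the number of the symmetry operation and each of k, l, m is the lattice translation along that axis plus 5, so 5 means "no shift". Below is a minimal sketch of decoding such a code. The helper name `parse_site_symmetry` is hypothetical and not part of pymatgen, and treating "." as "identity, no translation" is my reading of the dictionary.

```python
def parse_site_symmetry(code):
    """Decode a CIF site-symmetry code such as '1_545'.

    Returns (operation_number, (tx, ty, tz)), where the translations are
    whole lattice vectors. Hypothetical helper, not a pymatgen function.
    """
    if code in (".", "?"):
        # No code given: assume the identity operation in the home cell.
        return 1, (0, 0, 0)
    op, _, shifts = code.partition("_")
    if not shifts:
        # A bare operation number implies the untranslated (555) cell.
        return int(op), (0, 0, 0)
    if len(shifts) != 3 or not shifts.isdigit():
        raise ValueError(f"unexpected symmetry code: {code!r}")
    # Each digit stores translation + 5, which avoids negative numbers.
    return int(op), tuple(int(digit) - 5 for digit in shifts)


print(parse_site_symmetry("1_545"))  # shift of -1 along y
print(parse_site_symmetry("1_556"))  # shift of +1 along z
```

The translation tuple is the piece that would correspond to a periodic image offset on a graph edge.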

Andrew-S-Rosen avatar Nov 15 '21 01:11 Andrew-S-Rosen

Thanks @arosen93, that info on the periodicity labels _geom_bond_site_symmetry_1 etc. was exactly what I was missing. More info on that tag here: https://www.iucr.org/_data/iucr/cifdic_html/1/cif_core.dic/Igeom_bond_site_symmetry.html

mkhorton avatar Nov 16 '21 05:11 mkhorton

@alex-l-m do you have a sample CIF file with bonding information that we could incorporate into pymatgen as a test case?

mkhorton avatar Nov 16 '21 05:11 mkhorton

Hi, this would be a great feature to have! I am attaching an example CIF with the bond info :)

# CIF file generated by openbabel 2.4.90, see http://openbabel.sf.net
data_I
_chemical_name_common 'str_m12_o9_o13_f0_bcu.sym.134'
_cell_length_a 15.0575
_cell_length_b 15.1814
_cell_length_c 16.8456
_cell_angle_alpha 108.912
_cell_angle_beta 105.512
_cell_angle_gamma 109.889
_space_group_name_H-M_alt 'P 1'
_space_group_name_Hall 'P 1'
loop_
    _symmetry_equiv_pos_as_xyz
    x,y,z
loop_
    _atom_site_label
    _atom_site_type_symbol
    _atom_site_fract_x
    _atom_site_fract_y
    _atom_site_fract_z
    _atom_site_occupancy
    C0      C    0.08427   0.29217   0.08263   1.000
    C1      C    0.91134   0.72437   0.96799   1.000
    C2      C    0.02437   0.58531   0.74592   1.000
    C3      C    0.96994   0.53523   0.27556   1.000
    C4      C    0.76020   0.19775   0.72483   1.000
    C5      C    0.33697   0.78219   0.25411   1.000
    C6      C    0.30651   0.47935   0.98438   1.000
    C7      C    0.63846   0.42759   0.93764   1.000
    C8      C    0.91112   0.93227   0.04701   1.000
    C9      C    0.09230   0.11281   0.10333   1.000
    C10     C    0.00420   0.92694   0.05744   1.000
    C11     C    0.99663   0.11391   0.07721   1.000
    C12     C    0.09743   0.02056   0.08989   1.000
    C13     C    0.90545   0.02428   0.05526   1.000
    C14     C    0.71802   0.95340   0.95291   1.000
    C15     C    0.28234   0.07865   0.18993   1.000
    C16     C    0.60326   0.99837   0.01510   1.000
    C17     C    0.40056   0.05270   0.12294   1.000
    C18     C    0.48628   0.56359   0.27606   1.000
    C19     C    0.50672   0.32366   0.64285   1.000
    C20     C    0.48091   0.54814   0.18811   1.000
    C21     C    0.49355   0.33012   0.72323   1.000
    C22     C    0.93468   0.52380   0.74922   1.000
    C23     C    0.06291   0.55049   0.26576   1.000
    C24     C    0.45126   0.44758   0.12025   1.000
    C25     C    0.47645   0.41134   0.77444   1.000
    C26     C    0.42986   0.36334   0.14243   1.000
    C27     C    0.47005   0.48446   0.74216   1.000
    C28     C    0.43482   0.37876   0.23018   1.000
    C29     C    0.48288   0.47782   0.66158   1.000
    C30     C    0.46059   0.47795   0.29728   1.000
    C31     C    0.50376   0.39938   0.61206   1.000
    C32     C    0.51643   0.40195   0.00422   1.000
    C33     C    0.39120   0.44291   0.88187   1.000
    C34     C    0.53471   0.40743   0.92849   1.000
    C35     C    0.38375   0.45256   0.96522   1.000
    C36     C    0.46754   0.42052   0.86238   1.000
    C37     C    0.44639   0.43058   0.02682   1.000
    C38     C    0.61880   0.94325   0.94027   1.000
    C39     C    0.38174   0.09070   0.20079   1.000
    C40     C    0.95912   0.58124   0.36227   1.000
    C41     C    0.02800   0.63229   0.68164   1.000
    C42     C    0.12933   0.71851   0.61392   1.000
    C43     C    0.85183   0.56103   0.44638   1.000
    C44     C    0.77698   0.20321   0.81194   1.000
    C45     C    0.28065   0.78425   0.17571   1.000
    C46     C    0.12238   0.70603   0.69081   1.000
    C47     C    0.86035   0.54880   0.36404   1.000
    C48     C    0.93616   0.59493   0.60393   1.000
    C49     C    0.04850   0.64882   0.44657   1.000
    C50     C    0.94195   0.60944   0.52789   1.000
    C51     C    0.04031   0.66253   0.52986   1.000
    C52     C    0.71958   0.10268   0.63794   1.000
    C53     C    0.44598   0.85643   0.32010   1.000
    C54     C    0.59148   0.03144   0.39124   1.000
    C55     C    0.61668   0.91305   0.55199   1.000
    C56     C    0.49966   0.94972   0.31608   1.000
    C57     C    0.68440   0.00306   0.63561   1.000
    C58     C    0.49278   0.84030   0.39432   1.000
    C59     C    0.70089   0.10975   0.55462   1.000
    C60     C    0.58186   0.92296   0.47121   1.000
    C61     C    0.62833   0.02099   0.47142   1.000
    C62     C    0.68705   0.06412   0.10257   1.000
    C63     C    0.32013   0.00309   0.03400   1.000
    C64     C    0.78709   0.07671   0.11504   1.000
    C65     C    0.22064   0.99138   0.02275   1.000
    C66     C    0.27673   0.55014   0.96545   1.000
    C67     C    0.69672   0.39596   0.98754   1.000
    C68     C    0.80302   0.02105   0.04027   1.000
    C69     C    0.20082   0.02771   0.10082   1.000
    C70     C    0.78225   0.29834   0.73432   1.000
    C71     C    0.27441   0.68998   0.25265   1.000
    C72     C    0.25528   0.44334   0.03211   1.000
    C73     C    0.70404   0.49802   0.91860   1.000
    C74     C    0.10642   0.60045   0.82024   1.000
    C75     C    0.89016   0.46115   0.18837   1.000
    C76     C    0.91242   0.20510   0.00246   1.000
    C77     C    0.08457   0.80538   0.05117   1.000
    C78     C    0.99613   0.20051   0.05639   1.000
    C79     C    0.99949   0.82293   0.02891   1.000
    H80     H    0.16125   0.32392   0.13610   1.000
    H81     H    0.83135   0.70787   0.93520   1.000
    H82     H    0.72878   0.90860   0.89487   1.000
    H83     H    0.26880   0.10912   0.25072   1.000
    H84     H    0.51190   0.64245   0.32739   1.000
    H85     H    0.51837   0.25909   0.60503   1.000
    H86     H    0.50034   0.61456   0.17296   1.000
    H87     H    0.49684   0.27154   0.74565   1.000
    H88     H    0.40844   0.28548   0.09140   1.000
    H89     H    0.45719   0.54803   0.78025   1.000
    H90     H    0.41773   0.31292   0.24605   1.000
    H91     H    0.47808   0.53524   0.63809   1.000
    H92     H    0.56743   0.39101   0.05493   1.000
    H93     H    0.33798   0.45340   0.83264   1.000
    H94     H    0.12510   0.68214   0.44778   1.000
    H95     H    0.86084   0.54460   0.59721   1.000
    H96     H    0.78846   0.50779   0.30271   1.000
    H97     H    0.19280   0.74752   0.75410   1.000
    H98     H    0.20491   0.76536   0.61846   1.000
    H99     H    0.77522   0.52547   0.44495   1.000
    H100    H    0.85622   0.50749   0.71060   1.000
    H101    H    0.14133   0.60185   0.31920   1.000
    H102    H    0.55371   0.89165   0.87274   1.000
    H103    H    0.44430   0.12993   0.26958   1.000
    H104    H    0.73209   0.18526   0.55455   1.000
    H105    H    0.45465   0.76821   0.39794   1.000
    H106    H    0.70169   0.99438   0.69833   1.000
    H107    H    0.46778   0.96307   0.25822   1.000
    H108    H    0.62762   0.10574   0.38984   1.000
    H109    H    0.58352   0.83769   0.55225   1.000
    H110    H    0.52616   0.98946   0.00531   1.000
    H111    H    0.47762   0.06221   0.13173   1.000
    H112    H    0.67448   0.10546   0.16074   1.000
    H113    H    0.33501   0.97423   0.97378   1.000
    H114    H    0.85133   0.12743   0.18321   1.000
    H115    H    0.15908   0.95383   0.95362   1.000
    H116    H    0.76785   0.14020   0.83056   1.000
    H117    H    0.31080   0.83562   0.14707   1.000
    H118    H    0.31233   0.60136   0.93937   1.000
    H119    H    0.66599   0.32839   0.99935   1.000
    H120    H    0.77781   0.32416   0.68147   1.000
    H121    H    0.29757   0.65422   0.29473   1.000
    H122    H    0.25723   0.38609   0.05621   1.000
    H123    H    0.68325   0.53284   0.87468   1.000
    H124    H    0.83075   0.15379   0.97779   1.000
    H125    H    0.16359   0.85952   0.10062   1.000
    H126    H    0.84092   0.86321   0.02877   1.000
    H127    H    0.16411   0.18090   0.12359   1.000
    N128    N    0.05566   0.34855   0.04562   1.000
    N129    N    0.94434   0.65145   0.95438   1.000
    N130    N    0.96083   0.50785   0.82370   1.000
    N131    N    0.03917   0.49215   0.17630   1.000
    N132    N    0.81129   0.30341   0.87165   1.000
    N133    N    0.18871   0.69659   0.12835   1.000
    N134    N    0.20520   0.55299   0.99767   1.000
    N135    N    0.79480   0.44701   0.00233   1.000
    N136    N    0.81480   0.36183   0.82370   1.000
    N137    N    0.18520   0.63817   0.17630   1.000
    N138    N    0.20168   0.49457   0.04562   1.000
    N139    N    0.79832   0.50543   0.95438   1.000
    N140    N    0.06842   0.56053   0.87165   1.000
    N141    N    0.93158   0.43947   0.12835   1.000
    N142    N    0.94807   0.29586   0.99767   1.000
    N143    N    0.05193   0.70414   0.00233   1.000
    Ni144   Ni   0.11934   0.49342   0.10996   1.000
    Ni145   Ni   0.88066   0.50658   0.89004   1.000
    Ni146   Ni   0.87259   0.37259   0.00000   1.000
    Ni147   Ni   0.12741   0.62741   0.00000   1.000
    O148    O    0.46722   0.48837   0.38403   1.000
    C149    C    0.42695   0.55494   0.42131   1.000
    H150    H    0.35373   0.54062   0.36841   1.000
    H151    H    0.48676   0.63880   0.45517   1.000
    H152    H    0.40769   0.53839   0.47575   1.000
    O153    O    0.51338   0.39669   0.53148   1.000
    C154    C    0.58545   0.36138   0.51562   1.000
    H155    H    0.54825   0.27392   0.48248   1.000
    H156    H    0.65753   0.39671   0.58014   1.000
    H157    H    0.60946   0.38620   0.46637   1.000
    O158    O    0.21125   0.65560   0.84420   1.000
    C159    C    0.23641   0.59272   0.78035   1.000
    H160    H    0.21386   0.51360   0.77769   1.000
    H161    H    0.19888   0.58148   0.70908   1.000
    H162    H    0.32203   0.63262   0.80288   1.000
    O163    O    0.78525   0.41314   0.16592   1.000
    C164    C    0.75499   0.30844   0.15067   1.000
    H165    H    0.79856   0.30542   0.21376   1.000
    H166    H    0.76534   0.26297   0.09023   1.000
    H167    H    0.67087   0.26987   0.13475   1.000
loop_
    _geom_bond_atom_site_label_1
    _geom_bond_atom_site_label_2
    _geom_bond_distance
    _geom_bond_site_symmetry_2
    _ccdc_geom_bond_type
    Ni146  N135      1.88315      .   S
    Ni146  N141      1.85224      .   S
    Ni146  N132      1.85224  1_554   S
    Ni146  N142      1.88287  1_554   S
    Ni147  N143      1.88287      .   S
    Ni147  N133      1.85224      .   S
    Ni147  N140      1.85224  1_554   S
    Ni147  N134      1.88315  1_554   S
    N135   N139      1.37737  1_554   S
    N135   C67       1.32029  1_554   S
    N143   C77       1.31556      .   S
    N143   N129      1.38433  1_454   S
    C76    C78       1.38112      .   S
    C76    H124      1.07554  1_554   S
    C76    N142      1.33612  1_554   S
    C32    C37       1.36569      .   S
    C32    H92       1.08388      .   S
    C32    C34       1.39695  1_554   S
    H110   C16       1.08156      .   S
    C16    C62       1.39809  1_565   S
    C16    C38       1.39680  1_554   S
    C65    C63       1.39891  1_565   S
    C65    C69       1.40520  1_565   S
    C65    H115      1.08314  1_554   S
    C37    C24       1.48695      .   S
    C37    C35       1.41173  1_554   S
    H126   C8        1.08367      .   S
    C79    C77       1.37575  1_655   S
    C79    C10       1.46394  1_655   S
    C79    C1        1.40665  1_554   S
    C72    N138      1.31109      .   S
    C72    H122      1.07732      .   S
    C72    C6        1.36747  1_554   S
    C63    C17       1.39865      .   S
    C63    H113      1.08230  1_544   S
    C68    C13       1.47560      .   S
    C68    C64       1.40279      .   S
    C68    C14       1.40612  1_544   S
    N128   C0        1.32672      .   S
    N128   Ni144     1.84262      .   S
    N128   N142      1.38433  1_454   S
    N138   Ni144     1.85163      .   S
    N138   N134      1.37737  1_554   S
    C8     C13       1.39272  1_565   S
    C8     C10       1.39866  1_655   S
    C77    H125      1.07050      .   S
    C13    C11       1.41583      .   S
    C78    C11       1.46819      .   S
    C78    C0        1.38697  1_655   S
    C10    C12       1.42044  1_565   S
    C11    C9        1.39683  1_655   S
    C0     H80       1.08042      .   S
    C12    C69       1.47923      .   S
    C12    C9        1.37691      .   S
    H166   C164      1.11344      .   S
    H88    C26       1.08348      .   S
    C69    C15       1.40501      .   S
    C62    C64       1.39941      .   S
    C62    H112      1.08225      .   S
    C9     H127      1.08056      .   S
    Ni144  N131      1.85100      .   S
    Ni144  N137      1.84227      .   S
    C64    H114      1.08315      .   S
    C24    C26       1.40458      .   S
    C24    C20       1.40304      .   S
    C17    H111      1.08142      .   S
    C17    C39       1.39726      .   S
    N133   C45       1.33846      .   S
    N133   N137      1.37738      .   S
    N141   N131      1.38433  1_655   S
    N141   C75       1.33446      .   S
    H167   C164      1.10953      .   S
    C26    C28       1.39661      .   S
    H117   C45       1.08049      .   S
    C164   O163      1.40714      .   S
    C164   H165      1.11406      .   S
    O163   C75       1.37402      .   S
    H86    C20       1.08328      .   S
    C45    C5        1.38205      .   S
    N131   C23       1.34710      .   S
    N137   C71       1.33357      .   S
    C20    C18       1.39805      .   S
    C75    C3        1.38984      .   S
    C15    C39       1.39730      .   S
    C15    H83       1.08311      .   S
    C39    H103      1.08257      .   S
    C28    H90       1.08237      .   S
    C28    C30       1.40002      .   S
    C71    C5        1.38413      .   S
    C71    H121      1.07931      .   S
    C5     C53       1.46735      .   S
    H107   C56       1.07955      .   S
    C23    C3        1.40297  1_455   S
    C23    H101      1.07827      .   S
    C3     C40       1.48026      .   S
    C18    C30       1.41448      .   S
    C18    H84       1.07988      .   S
    C30    O148      1.38993      .   S
    H96    C47       1.07378      .   S
    C56    C53       1.40577      .   S
    C56    C54       1.39002  1_565   S
    C53    C58       1.40167      .   S
    C40    C47       1.41083      .   S
    C40    C49       1.40510  1_655   S
    C47    C43       1.38538      .   S
    H150   C149      1.11478      .   S
    O148   C149      1.41160      .   S
    H108   C54       1.08409      .   S
    C54    C61       1.39843      .   S
    C58    H105      1.07960      .   S
    C58    C60       1.38690      .   S
    C149   H151      1.10986      .   S
    C149   H152      1.10922      .   S
    H99    C43       1.08440      .   S
    C43    C50       1.39193      .   S
    C49    H94       1.08184      .   S
    C49    C51       1.39556      .   S
    H157   C154      1.10951      .   S
    C60    C61       1.41109  1_565   S
    C60    C55       1.39773      .   S
    C61    C59       1.38883      .   S
    H155   C154      1.11119      .   S
    C154   O153      1.41224      .   S
    C154   H156      1.11448      .   S
    C50    C51       1.41012  1_655   S
    C50    C48       1.38802      .   S
    C51    C42       1.39770      .   S
    O153   C31       1.39235      .   S
    C55    H109      1.08356      .   S
    C55    C57       1.38824  1_565   S
    H104   C59       1.08136      .   S
    C59    C52       1.40280      .   S
    H95    C48       1.08093      .   S
    C48    C41       1.40159  1_655   S
    H85    C19       1.08143      .   S
    C31    C19       1.41477      .   S
    C31    C29       1.40139      .   S
    C42    H98       1.08366      .   S
    C42    C46       1.39383      .   S
    C57    C52       1.40538      .   S
    C57    H106      1.08058  1_545   S
    C52    C4        1.46648      .   S
    H91    C29       1.08282      .   S
    C19    C21       1.39758      .   S
    C29    C27       1.39834      .   S
    H120   C70       1.07828      .   S
    C41    C46       1.40871      .   S
    C41    C2        1.47988      .   S
    C46    H97       1.07860      .   S
    H161   C159      1.10839      .   S
    H100   C22       1.07965      .   S
    C21    H87       1.08326      .   S
    C21    C25       1.40573      .   S
    C4     C70       1.39149      .   S
    C4     C44       1.39025      .   S
    C70    N136      1.33400      .   S
    C27    C25       1.40464      .   S
    C27    H89       1.08299      .   S
    C2     C22       1.38375  1_455   S
    C2     C74       1.39267      .   S
    C22    N130      1.33623      .   S
    C25    C36       1.48915      .   S
    H160   C159      1.11339      .   S
    C159   H162      1.10950      .   S
    C159   O158      1.41770      .   S
    C44    H116      1.07733      .   S
    C44    N132      1.33718      .   S
    C74    O158      1.37721      .   S
    C74    N140      1.34683      .   S
    N136   N132      1.37738      .   S
    N136   Ni145     1.84227      .   S
    N130   N140      1.38433  1_655   S
    N130   Ni145     1.85100      .   S
    H93    C33       1.08274      .   S
    C36    C33       1.39283      .   S
    C36    C34       1.40721      .   S
    H102   C38       1.08247      .   S
    H123   C73       1.08029      .   S
    C33    C35       1.40259      .   S
    Ni145  N139      1.85163      .   S
    Ni145  N129      1.84262      .   S
    H82    C14       1.08332      .   S
    C73    C7        1.37650      .   S
    C73    N139      1.33457      .   S
    C34    C7        1.43985      .   S
    H81    C1        1.08051      .   S
    C7     C67       1.36396      .   S
    H118   C66       1.07185      .   S
    C38    C14       1.39748      .   S
    N129   C1        1.34007      .   S
    C35    C6        1.43524      .   S
    C66    C6        1.38970      .   S
    C66    N134      1.33637      .   S
    C67    H119      1.07920      .   S
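
In case it helps with a test case, here is a rough sketch of how the rows of the bond loop above could be turned into edge tuples. It assumes the five-column layout used in this file (label 1, label 2, distance, symmetry code of atom 2, bond type) and a P 1 cell, so only the translation digits of the symmetry code matter. The function name `parse_geom_bond_rows` is made up for illustration and is not a pymatgen API.

```python
def parse_geom_bond_rows(lines):
    """Parse rows of a _geom_bond loop laid out as
    label_1 label_2 distance site_symmetry_2 bond_type.

    Returns a list of (label_1, label_2, distance, image) tuples, where
    image is the lattice translation applied to the second atom.
    Assumes space group P 1, so the operation number is ignored.
    """
    bonds = []
    for line in lines:
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank lines, tags, or rows with another layout
        label_1, label_2, distance, symmetry_2, _bond_type = fields
        _, _, shifts = symmetry_2.partition("_")
        if len(shifts) == 3 and shifts.isdigit():
            # Digits are stored as translation + 5.
            image = tuple(int(digit) - 5 for digit in shifts)
        else:
            # "." or a bare operation number: no lattice translation.
            image = (0, 0, 0)
        bonds.append((label_1, label_2, float(distance), image))
    return bonds


rows = [
    "Ni146  N135      1.88315      .   S",
    "Ni146  N132      1.85224  1_554   S",
    "C16    C62       1.39809  1_565   S",
]
for bond in parse_geom_bond_rows(rows):
    print(bond)
```

The labels would still need to be mapped to site indices of the parsed structure before they could be used as graph edges.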

kyonofx avatar Jun 04 '22 19:06 kyonofx