Extracting 3D structure for ChEMBL compounds

Open jasperhyp opened this issue 2 years ago • 0 comments

Hi! I would like to assign 3D coordinates to the atoms of about 10K compounds that I originally obtained from ChEMBL. As many chemoinformatics papers have shown, 3D and 2D structures together can represent a molecule better. This is also emphasized in large-scale molecular representation learning challenges, where the 3D structures are estimated with DFT.

However, computing DFT geometries with publicly available libraries (e.g. pyscf) seems too time-consuming, unless I am missing something. The faster force field-based approach used by RDKit or OpenBabel (e.g. MMFF, UFF) generates less accurate 3D coordinates. I noticed that other databases such as DrugBank already provide 3D coordinates for drugs. Are there any downloadable dumps of 3D structures for all (available) ChEMBL compounds?
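For reference, the force-field route mentioned above looks roughly like the sketch below with RDKit. It is a minimal illustration, not a recommended pipeline: the helper name `embed_3d`, the fixed random seed, and the choice to fall back from MMFF to UFF are assumptions made for the example, and the resulting geometry is a single force-field conformer rather than a DFT-quality structure.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def embed_3d(smiles, seed=42, max_iters=200):
    """Return an RDKit Mol with one force-field-optimized 3D conformer,
    or None if the SMILES cannot be parsed or embedding fails.

    Hypothetical helper for illustration only.
    """
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None

    # Explicit hydrogens are needed for a sensible 3D geometry.
    mol = Chem.AddHs(mol)

    # Distance-geometry embedding (ETKDG) gives initial 3D coordinates.
    params = AllChem.ETKDGv3()
    params.randomSeed = seed  # assumption: fixed seed for reproducibility
    if AllChem.EmbedMolecule(mol, params) == -1:
        return None

    # Refine with MMFF94 when parameters exist for every atom, else UFF.
    if AllChem.MMFFHasAllMoleculeParams(mol):
        AllChem.MMFFOptimizeMolecule(mol, maxIters=max_iters)
    else:
        AllChem.UFFOptimizeMolecule(mol, maxIters=max_iters)
    return mol


if __name__ == "__main__":
    mol = embed_3d("CCO")  # ethanol, as a small test case
    if mol is not None:
        coords = mol.GetConformer().GetPositions()  # (n_atoms, 3) array
        for atom, xyz in zip(mol.GetAtoms(), coords):
            print(atom.GetSymbol(), xyz)
```

This takes a fraction of a second per small molecule, which is why it scales to 10K compounds, but the accuracy concern raised above still applies to its output.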

jasperhyp, Apr 30 '23 21:04