ANI1x_datasets

Missing Quadrupole Constants?

Open meschw04 opened this issue 1 year ago • 0 comments

Hello!

Thank you so much for making the ANI-1x dataset available; it is a fantastic resource. I have a question regarding the availability of quadrupoles for molecules/conformers in the dataset. According to the paper, the 'wb97x_dz.quadrupole' key should contain an array of size $N_c \times 6$, where $N_c$ is the number of conformers per molecule. When I look at this array, a significant number of rows are entirely NaN. I ran the following code snippet:

import h5py
import numpy as np

ani1x_data = h5py.File('ani-1x/ani1x-release.h5', 'r')
frac_quads_li = []
for i in ani1x_data.keys():
    # Read this molecule's (N_c, 6) quadrupole array into memory
    all_quads = ani1x_data[i]['wb97x_dz.quadrupole'][()]
    # Row indices of conformers with at least one non-NaN component
    all_quads_sub = np.unique(np.argwhere(~np.isnan(all_quads))[:,0])
    frac_quads_li.append(float(len(all_quads_sub))/len(all_quads))
print(f'Avg Fraction Computed Quads: {round(np.average(frac_quads_li),3)}')
print(f'No Quad Count: {np.sum(np.array(frac_quads_li)==0.0)}/{len(frac_quads_li)}')

...and got the following result:

Avg Fraction Computed Quads: 0.215
No Quad Count: 1698/3114

So it appears that many quadrupole rows are entirely NaN, and more than half of the molecules (1698 of 3114) have no quadrupole information for any conformer. When I run the same analysis on 'wb97x_dz.dipole', I find that 181 molecules have no dipole values available for any conformer. I did not find anything in the publication or the GitHub repo that mentions these NaN values (although I may have missed it). So I am wondering what happened to the dipoles/quadrupoles in these cases, and whether there is a version of the ANI-1x dataset that contains them. If not, that is fine: I am happy to recalculate them, or to omit the corresponding conformers from the analysis I am trying to do. But if a shareable version with these additional values is available, I would appreciate it, as it would save me some time and compute.
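For the "omit" option, a minimal sketch of the filtering I have in mind is below. The helper name `complete_conformer_mask` and the toy arrays are my own illustration, not part of the dataset tooling; the only assumption is that each property array has conformers along its first axis, as the two keys above do.

```python
import numpy as np

def complete_conformer_mask(group, keys=('wb97x_dz.dipole', 'wb97x_dz.quadrupole')):
    """Return a boolean mask of conformers that have no NaN in any of `keys`.

    `group` is any mapping from property name to an array with conformers
    along axis 0 (e.g. one molecule group of the HDF5 file, or a plain dict).
    """
    mask = None
    for key in keys:
        values = np.asarray(group[key])
        # Reduce over every axis except the conformer axis
        other_axes = tuple(range(1, values.ndim))
        ok = ~np.isnan(values).any(axis=other_axes)
        mask = ok if mask is None else (mask & ok)
    return mask

# Toy stand-in for one molecule with three conformers (values are made up)
toy = {
    'wb97x_dz.dipole': np.array([[0.1, 0.0, 0.2], [np.nan] * 3, [0.3, 0.1, 0.0]]),
    'wb97x_dz.quadrupole': np.array([[1.0] * 6, [np.nan] * 6, [np.nan] * 6]),
}
mask = complete_conformer_mask(toy)
kept_dipoles = toy['wb97x_dz.dipole'][mask]  # only conformers with both properties
```

Applied per molecule, the same mask could index any other per-conformer array so that all properties stay aligned.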

Thank you for your time,

Marcus Schwarting

meschw04, Dec 09 '23 16:12