spice-dataset
A collection of QM data for training potential functions
I noticed that with the default psi4 configuration, the analytical gradient might have large errors. See https://github.com/psi4/psi4/issues/3161. Then I looked at whether the SPICE dataset has such a problem. I...
SPICE 2 is close to finished, which means this is a good time to discuss possible additions for version 3. Here are some ideas to get things started. - **Nonbonded...
python .\downloader.py
Processing SPICE Solvated Amino Acids Single Points Dataset v1.1
Processing SPICE Dipeptides Single Points Dataset v1.2
Traceback (most recent call last):
  File "D:\dataset\spice-dataset\downloader\downloader.py", line 111, in recs...
As we continue to explore the best ways to generate data for future iterations of SPICE, it would be useful to apply a variety of dataset generation strategies to a...
This will eventually be a collection of molecules from Ligand Expo. I've written a draft of the script to identify molecules and variants we want to process. The current implementation...
A possible addition for a future version is to provide data for simulating metalloproteins. This would require conformations involving a metal atom in an appropriate protein-like environment. If we decide...
This PR will have the scripts to generate the pretraining dataset discussed in #64. So far I've implemented the dipeptides subset. Let me know if this looks good. @giadefa I'd...
I think there could be value in creating a separate dataset for pretraining. It would cover the same chemical space as the standard SPICE dataset, but have many more conformations...
I've started investigating whether it's possible to run stable MD simulations using models trained on SPICE. I'm opening this issue as a place to describe my results and discuss approaches....
The dataset file (https://github.com/openmm/spice-dataset/releases/download/1.0/SPICE.hdf5) doesn't contain the total molecular charge. It could be extracted by parsing the SMILES, but that is inconvenient and places an additional burden on users. The dataset...
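For reference, here is a rough sketch of the SMILES-parsing workaround mentioned above. It is not part of the SPICE tooling. The function name `total_charge_from_smiles` is hypothetical, and it assumes well-formed SMILES in which every formal charge is written inside a bracket atom (as the OpenSMILES grammar requires), optionally followed by an atom-map index such as `:1`. A cheminformatics toolkit would be the more robust choice.

```python
import re

# Bracket atoms look like [NH3+], [O-], [Fe+2] or, with atom maps, [N+:1].
_BRACKET = re.compile(r"\[([^\[\]]*)\]")
# A charge is a run of '+' or '-' signs, optionally followed by a magnitude.
_CHARGE = re.compile(r"(\++|-+)(\d*)")


def total_charge_from_smiles(smiles: str) -> int:
    """Sum the formal charges of all bracket atoms in a SMILES string.

    Assumption: the input is well-formed, so '+' and '-' inside brackets
    always denote charges ('-' outside brackets is a single bond).
    """
    total = 0
    for atom in _BRACKET.findall(smiles):
        # Drop the atom-map class (":1") so its digits are not misread.
        body = atom.split(":", 1)[0]
        match = _CHARGE.search(body)
        if match is None:
            continue  # neutral bracket atom, e.g. [13CH4] or [C@@H]
        signs, digits = match.groups()
        sign = 1 if signs[0] == "+" else -1
        # "+2" carries an explicit magnitude; "++" is counted by length.
        magnitude = int(digits) if digits else len(signs)
        total += sign * magnitude
    return total


if __name__ == "__main__":
    for example in ("[NH3+]CC(=O)[O-]", "C[N+](C)(C)C", "[O-]S(=O)(=O)[O-]"):
        print(example, total_charge_from_smiles(example))
```

Every user would have to repeat this step for every molecule, which is why storing the total charge directly in the file would be more convenient.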