Generalize `download_opls_xml` Function
Summary
Include a summary of major changes in bullet points:
- Feature 1: `src/atomate2/openmm` contains `utils.py`, through which the LigParGen server is accessed; currently, the `download_opls_xml` function allows only SMILES string inputs. This feature generalizes the function by changing the input dictionary (`dict[str, str]`) to a dictionary of dictionaries (`dict[str, dict[str, str]]`), where the new input also handles the molecule's charge and the number of optimization iterations. A working dictionary follows:
    mols = {
        'benzene': {
            'smiles': 'c1ccccc1'
        },
        'TMA': {
            'smiles': 'C[N+](C)(C)C',
            'checkopt': 3,
            'charge': "+1"
        }
    }
- Fix 1: The LigParGen server outputs an .XML file, and transferring that file to its final location could raise an error; using `shutil.copy(file, final_file)` resolves this issue.
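The nested input described in Feature 1 could be normalized before it is sent to LigParGen. The following is only a minimal sketch: the helper name `normalize_mol_params` and the default values for `checkopt` and `charge` are assumptions for illustration and are not taken from the PR.

```python
from __future__ import annotations

from typing import Any

# Hypothetical defaults, used only for this illustration.
DEFAULT_CHECKOPT = 0
DEFAULT_CHARGE = "0"


def normalize_mol_params(mols: dict[str, dict[str, Any]]) -> dict[str, dict[str, str]]:
    """Return a copy of `mols` in which every entry has all three keys as strings."""
    normalized: dict[str, dict[str, str]] = {}
    for name, params in mols.items():
        # The SMILES string is the only required key.
        if "smiles" not in params:
            raise ValueError(f"Molecule {name!r} is missing the required 'smiles' key.")
        normalized[name] = {
            "smiles": str(params["smiles"]),
            "checkopt": str(params.get("checkopt", DEFAULT_CHECKOPT)),
            "charge": str(params.get("charge", DEFAULT_CHARGE)),
        }
    return normalized


mols = {
    "benzene": {"smiles": "c1ccccc1"},
    "TMA": {"smiles": "C[N+](C)(C)C", "checkopt": 3, "charge": "+1"},
}
normalized_mols = normalize_mol_params(mols)
```

Filling in defaults up front means the rest of the download logic can assume that each molecule carries the same keys, whether or not the caller supplied them.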
Additional dependencies introduced (if any)
- shutil: code like `Path(file).rename(final_file)` can fail in environments where the .XML file from LigParGen is created in a `/tmp` folder. `shutil` allows a standard copy; it is possible that, in a high-throughput use case, this results in a performance loss.
- selenium.webdriver.support.ui.WebDriverWait and selenium.webdriver.support.expected_conditions: both are from selenium and safeguard against LigParGen server crashes.
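To illustrate the `shutil` point: `Path.rename` raises an `OSError` when the source and destination are on different filesystems, which is often the case for `/tmp`, whereas `shutil.copy` reads and rewrites the file contents and so works across filesystems. The sketch below uses a hypothetical helper `save_download` and temporary directories in place of the real download folder.

```python
import shutil
import tempfile
from pathlib import Path


def save_download(downloaded: Path, final_file: Path) -> Path:
    """Copy a downloaded file to its final location (hypothetical helper)."""
    # Create the destination folder if it does not exist yet.
    final_file.parent.mkdir(parents=True, exist_ok=True)
    # shutil.copy duplicates the contents, so it does not depend on both
    # paths living on the same filesystem the way Path.rename does.
    shutil.copy(downloaded, final_file)
    return final_file


with tempfile.TemporaryDirectory() as download_dir, tempfile.TemporaryDirectory() as out_dir:
    src = Path(download_dir) / "benzene.xml"
    src.write_text("<ForceField/>")
    dest = save_download(src, Path(out_dir) / "opls" / "benzene.xml")
    copied_text = dest.read_text()
    source_still_exists = src.exists()
```

Unlike a rename, the copy leaves the original file in place, which is the extra work behind the possible performance cost noted above.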
TODO (if any)
If this is a work-in-progress, write something about what else needs to be done.
- Feature 1 is supported in `utils.py`, but `tests/openmm_md/test_utils.py` has not yet been updated to cover it.
Checklist
Work-in-progress pull requests are encouraged, but please put [WIP] in the pull request title.
Before a pull request can be merged, the following items must be checked:
- [ ] Code is in the standard Python style.
The easiest way to handle this is to run the following in the correct sequence on
your local machine. Start with running
`ruff` and `ruff format` on your new code. This will automatically reformat your code to PEP8 conventions and fix many linting issues.
- [ ] Doc strings have been added in the Numpy docstring format. Run ruff on your code.
- [ ] Type annotations are highly encouraged. Run mypy to type check your code.
- [ ] Tests have been added for any new functionality or bug fixes.
- [ ] All linting and tests pass.
Note that the CI system will run all the above checks. But it will be much more
efficient if you already fix most errors prior to submitting the PR. It is highly
recommended that you use the pre-commit hook provided in the repository. Simply run
pre-commit install and a check will be run prior to allowing commits.
Thanks @shehan807. This looks good to me. Can you install and run the linter on your changes to ensure they match the style guidelines: https://materialsproject.github.io/atomate2/dev/dev_install.html#installing-pre-commit
pip install pre-commit
pre-commit run --all-files
Hi @utf, I just wanted to follow up on this. It seems like the only issue concerns the changes I've made to the time.sleep call. All I've done is add a dependence on the number of optimization iterations selected in LigParGen, since more iterations may increase the time it takes to output the .xml/.pdb files.
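As a rough sketch of that dependence (the function name, the constants, and the linear scaling below are assumptions for illustration, not the values used in the PR), the wait time can grow with the number of optimization iterations:

```python
# Hypothetical timing constants, in seconds.
BASE_WAIT_S = 2.0
PER_ITERATION_WAIT_S = 1.0


def ligpargen_wait_seconds(
    checkopt: int,
    base: float = BASE_WAIT_S,
    per_iteration: float = PER_ITERATION_WAIT_S,
) -> float:
    """Return how long to wait for LigParGen output, given the iteration count."""
    if checkopt < 0:
        raise ValueError("checkopt must be non-negative.")
    # More optimization iterations means a longer wait before the files appear.
    return base + per_iteration * checkopt


# The result would then be passed to time.sleep, e.g.
# time.sleep(ligpargen_wait_seconds(3))
wait_for_tma = ligpargen_wait_seconds(3)
```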
On a slightly separate note, I wanted to raise the issue of version control for LigParGen. Based on my issue on the LigParGen repository (https://github.com/Isra3l/ligpargen/issues/31#issue-2639431963), I learned that the server has only kept up with BOSS v4.9 (the program managed by the Jorgensen group that operates under the hood for LigParGen). Perhaps @orionarcher could comment here, since in the case of high-throughput OPLS-AA simulations, I wonder if extending the current download_opls_xml function to interface with the BOSS source code is a feasible option. The BOSS v5.1 source code is available online (https://zarbi.chem.yale.edu/software.html), but I am unsure how licensing works with incorporating it into atomate2. I'm happy to contribute to whatever extent this may be useful!
One comment but LGTM, thanks @shehan807.
It should be fine to build an integration that works if BOSS is available as an executable. Atomate2 has integrations with other closed-source codes like VASP.
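One way such an optional integration could be gated is sketched below. The helper names are hypothetical, and the executable name passed in is an assumption; how BOSS is actually launched on a given machine may differ.

```python
from __future__ import annotations

import shutil


def find_executable(name: str) -> str | None:
    """Return the full path to `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


def require_executable(name: str) -> str:
    """Return the path to `name`, or raise a clear error if it is not installed."""
    path = find_executable(name)
    if path is None:
        raise RuntimeError(
            f"{name!r} was not found on PATH; install it separately to use this integration."
        )
    return path
```

With a check like this, the integration only runs where the user has installed the external code, and nothing closed-source has to ship with the package.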
The changes in this PR are now merged with #1111