
Generalize `download_opls_xml` Function

Open shehan807 opened this issue 1 year ago • 4 comments

Summary

Include a summary of major changes in bullet points:

  • Feature 1: utils.py in src/atomate2/openmm accesses the LigParGen server; currently, the download_opls_xml function accepts only SMILES strings as input. This feature generalizes it by changing the input from a dictionary (dict[str, str]) to a dictionary of dictionaries (dict[str, dict[str, str]]), where the inner dictionary also carries the molecule's charge and the number of optimization iterations. A working example follows:
mols = {
        'benzene':{
            'smiles':'c1ccccc1'
            },
        'TMA':{
            'smiles':'C[N+](C)(C)C',
            'checkopt':3,
            'charge':"+1"
            }
        }
  • Fix 1: The .XML file output by the LigParGen server caused an error when it was transferred to the new file; using shutil.copy(file, final_file) resolves this issue.
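
To illustrate the nested input format from Feature 1, here is a minimal sketch of how per-molecule options could be filled in with defaults before they are sent to LigParGen. The helper name normalize_mol_specs and the default values are assumptions made for illustration; they are not taken from utils.py.

```python
from typing import Any

# Hypothetical defaults; the values actually used by download_opls_xml may differ.
DEFAULTS: dict[str, Any] = {"charge": "0", "checkopt": 0}


def normalize_mol_specs(mols: dict[str, dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Return a copy of mols with missing optional keys filled from DEFAULTS."""
    normalized = {}
    for name, spec in mols.items():
        # 'smiles' is the only key every entry in the example above provides.
        if "smiles" not in spec:
            raise ValueError(f"Molecule {name!r} is missing the required 'smiles' key")
        # Values given by the user override the defaults.
        normalized[name] = {**DEFAULTS, **spec}
    return normalized


mols = {
    "benzene": {"smiles": "c1ccccc1"},
    "TMA": {"smiles": "C[N+](C)(C)C", "checkopt": 3, "charge": "+1"},
}
specs = normalize_mol_specs(mols)
```

With this approach, an entry that only provides a SMILES string (like benzene) keeps working, while charged molecules (like TMA) can override the optional keys.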

Additional dependencies introduced (if any)

  • shutil (Python standard library): code like Path(file).rename(final_file) can fail in environments where the .XML file from LigParGen is created in a /tmp folder. shutil performs a standard copy instead; in a high-throughput setting, this could cause some performance loss.
  • selenium.webdriver.support.ui.WebDriverWait and selenium.webdriver.support.expected_conditions: both come from selenium and safeguard against LigParGen server crashes.
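
The rename-versus-copy point can be sketched as follows. This is a minimal example under stated assumptions: the helper name copy_to_final and the file names are hypothetical, and a temporary directory stands in for the folder where LigParGen's download lands.

```python
import shutil
import tempfile
from pathlib import Path


def copy_to_final(file: Path, final_file: Path) -> Path:
    """Copy file to final_file, creating the parent directory if needed."""
    final_file.parent.mkdir(parents=True, exist_ok=True)
    # shutil.copy reads and rewrites the contents, so it also works when the
    # source and destination are on different filesystems, a case in which
    # Path.rename can raise OSError.
    shutil.copy(file, final_file)
    return final_file


# Stand-in for a LigParGen download: write a placeholder file in a temp directory.
with tempfile.TemporaryDirectory() as download_dir, tempfile.TemporaryDirectory() as out_dir:
    downloaded = Path(download_dir) / "molecule.xml"
    downloaded.write_text("<ForceField/>")
    result = copy_to_final(downloaded, Path(out_dir) / "xml" / "benzene.xml")
    copied_text = result.read_text()
```

Unlike a rename, the copy leaves the original file in place, which is where the possible overhead for many molecules comes from.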

TODO (if any)

If this is a work-in-progress, write something about what else needs to be done.

  • Feature 1 is implemented in utils.py, but the tests in tests/openmm_md/test_utils.py have not been updated to cover it yet.

Checklist

Work-in-progress pull requests are encouraged, but please put [WIP] in the pull request title.

Before a pull request can be merged, the following items must be checked:

  • [ ] Code is in the standard Python style. The easiest way to handle this is to run the following in the correct sequence on your local machine. Start with running ruff and ruff format on your new code. This will automatically reformat your code to PEP8 conventions and fix many linting issues.
  • [ ] Doc strings have been added in the Numpy docstring format. Run ruff on your code.
  • [ ] Type annotations are highly encouraged. Run mypy to type check your code.
  • [ ] Tests have been added for any new functionality or bug fixes.
  • [ ] All linting and tests pass.

Note that the CI system will run all the above checks. But it will be much more efficient if you already fix most errors prior to submitting the PR. It is highly recommended that you use the pre-commit hook provided in the repository. Simply run pre-commit install and a check will be run prior to allowing commits.

shehan807, Nov 12 '24 00:11

Thanks @shehan807. This looks good to me. Can you install and run the linter on your changes to ensure they match the style guidelines: https://materialsproject.github.io/atomate2/dev/dev_install.html#installing-pre-commit

pip install pre-commit
pre-commit run --all

utf, Nov 12 '24 15:11

Hi @utf, I just wanted to follow up on this. It seems the only open issue concerns the changes I've made to the time.sleep call. All I've done is make the wait depend on the number of optimization iterations selected in LigParGen, since more iterations may increase the time it takes to output the .xml/.pdb files.
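
As a rough sketch of the idea (the function name, parameter names, and constants here are hypothetical and not the values used in the PR), the wait could scale linearly with the iteration count:

```python
def wait_seconds(checkopt: int = 0, base: float = 2.0, per_iteration: float = 1.0) -> float:
    """Return how long to sleep before looking for the LigParGen output files.

    base and per_iteration are illustrative placeholders, not measured timings.
    """
    if checkopt < 0:
        raise ValueError("checkopt must be non-negative")
    # Each extra optimization iteration adds a fixed amount to the base wait.
    return base + per_iteration * checkopt


# A caller would then do something like: time.sleep(wait_seconds(checkopt))
default_wait = wait_seconds()
tma_wait = wait_seconds(checkopt=3)
```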

On a slightly separate note, I wanted to raise the issue of LigParGen versioning. Based on my issue on the LigParGen repository (https://github.com/Isra3l/ligpargen/issues/31#issue-2639431963), I learned that the server has only kept up with BOSS v4.9 (the program managed by the Jorgensen group that operates under the hood for LigParGen). Perhaps @orionarcher could comment here: for high-throughput OPLS-AA simulations, I wonder whether extending the current download_opls_xml function to interface with the BOSS source code is a feasible option. The BOSS v5.1 source code is available online (https://zarbi.chem.yale.edu/software.html), but I am unsure how licensing would work if it were incorporated into atomate2. I'm happy to contribute to whatever extent this may be useful!

shehan807, Nov 25 '24 16:11

One comment but LGTM, thanks @shehan807.

It should be fine to build an integration that works if BOSS is available as an executable. Atomate2 has integrations with other closed-source codes like VASP.

orionarcher, Nov 25 '24 18:11

The changes in this PR are now merged with #1111

shehan807, Jan 25 '25 18:01