MLIP and Water Tutorial
This PR adds the MLIP/Water Tutorial. It is spilt into three parts
- TIP4P Water by @hidekb
- MLIP Setup and evaluation by @PythonFZ
- MLIP MD by @hidekb
For testing until merged, you can use
git clone https://github.com/PythonFZ/espresso.git
cd espresso
git checkout mlip-water
cd doc/tutorials/mlip-water/
# use your favorite way of installation, e.g. uv sync / pip install . or pre-installed
dvc pull
Check out this pull request on ![]()
See visual diffs & provide feedback on Jupyter Notebooks.
Powered by ReviewNB
Thank you! It looks very good.
A few points from an initial quick read:
Part 1
- considr using GPU P3M, if a GPu is available.
- maybe extend description of how the rigid body with Virtual Sites works
Part 2
- For newcomers, some of the terminology might not be fully clear, e.g., hyperparameter, descriptor...
- Maybe it can be clarified what "true energy" erfers to, i.e., make more explicit that this refers to the result from qm calcuations on which the model was trained
- Eventhough the models are pre-trained, can one show some samples from the training data visually?
- What is the role of zntrack?
Part 3
- Maybem we can make it clear that this script in essence works for other MLIPs, as long as they provide an ASE calculator
- Explain thaat ase.integrate() assignes to the particles' ext_force. This means that
- this attribute is overwritten
- but otherwise, all ESPResSo interactions can be used in combination with the ASe calculator
I now had a look at the first part, this is looking very impressive to me (see my comments above)!
I just did a slightly longer (100x) run to test the pressure in the simulation, at least with the current friction coefficient (see my comment above) it seems reasonable: $$313 \pm 45$$ bar, which seems very plausible to me...
I will look into parts 2 and 3 tomorrow.
In the ASE interface, we are setting the calculator in a loop.
for _ in tbar:
atoms = system.ase.get()
atoms.calc = box.calc
The .calc is a property (https://gitlab.com/ase/ase/-/blob/master/ase/atoms.py#L396) and there might be a performance consideration here.
Hi,
I had another look and all my points seem well adressed. I just pushed quite a few of typo corrections and didactic rephrasings. From my side this is finished and can be used as tutorial!
A few open points:
-
Concerning the CUDA support for P3M in the first parg suggested by @RudolfWeeber , the way it is implemented now always demands CUDA, so the code will fail without CUDA installed - I think this should be a switch checking if CUDA is available and then using it.
-
I don't understand the last paragraph in the third tutorial part,
Because our MLIP learns all interactions, it can happen that minimum bond distances are not obeyed, (...)
I guess you are refering to the data $$r_\text{max} = 2\text{\AA}?$$ Then I would simply conclude that such a small cutoff misses the fundamental interactions and therefore produces nonphysical results, leaving out the rest of the discussion.
this can be mitigated by including short-range two-body
why would you want to do that?
I can shorten this, but I think it includes some valuable information
Because our MLIP learns all interactions, it can happen that minimum bond distances are not obeyed, leading to unphysical behavior in the simulation that are reflected in the RDF plots.
I think this highlights again, that there are e.g. no springs between particles and if not enough information is given, the system can blow up. Maybe including another sentence to explicitly refer to the 2 A data might help?
Although this can be mitigated by including short-range two-body potential terms, fundamentally, if such behavior occurs, the model parameters or training data should be improved.
Including the Ziegler-Biersack-Littmark (ZBL) short-range correction potential is common practice in many MLIPs but for our model, we have seen that including these correction terms might not necessarily improve the model accuracy.
If this is information is to coarse and doesn't really benefit the model, I'll shorten the section as you've suggested?
If this is information is to coarse and doesn't really benefit the model, I'll shorten the section as you've suggested?
Yes, I think either you need to provide more explanations (like the one you provided above) or simplify the section...