
Adaptation of my research to UMA?

Open brunosamp4 opened this issue 10 months ago • 5 comments

What would you like to report?

In my research, I was using the eq2_153M_ec4_allmd.pt model to predict the adsorption energy of some small molecules on a Ni(111) slab. My plan was to calculate it with the Equiformer model and then fine-tune the model on my specific systems to reduce the MAE. However, as I understand it, this model no longer works in fairchem v2. Can I do the same thing with UMA? If so, should I just use the "uma-sm" checkpoint?

I asked the same question on the Hugging Face forum, but I can't see it anymore, so I thought it might be more appropriate to ask here.

brunosamp4 avatar May 28 '25 12:05 brunosamp4

Yup sorry about that, we closed the forum so we can consolidate everything here.

Correct, the older models are no longer supported in v2. Fortunately, uma-sm is a drop-in replacement here. See the example below:

from ase.build import fcc100, add_adsorbate, molecule
from ase.optimize import LBFGS
from fairchem.core import pretrained_mlip, FAIRChemCalculator

predictor = pretrained_mlip.get_predict_unit("uma-sm", device="cuda")
calc = FAIRChemCalculator(predictor, task_name="oc20")

# Set up your system as an ASE atoms object
slab = fcc100("Cu", (3, 3, 3), vacuum=8, periodic=True)
adsorbate = molecule("CO")
add_adsorbate(slab, adsorbate, 2.0, "bridge")

slab.calc = calc

# Set up LBFGS dynamics object
opt = LBFGS(slab)
opt.run(0.05, 100)

The one difference, though, is that the energy returned here is a total energy, so if you want to compute an adsorption energy you should use the same code to run a bare-slab relaxation as well. Hope this helps!

mshuaibii avatar May 28 '25 14:05 mshuaibii

@mshuaibii Ah, I see. So I don't need to calculate the energy of the isolated adsorbate as well?

Regarding my original question, I have two more questions:

  1. In fairchem v1, the energy was already the adsorption energy with respect to a reference reaction, which was:

x CO + (x + y/2 - z) H2 + (z-x) H2O + w/2 N2 + * -> CxHyOzNw*

As I understand it, in fairchem v2, using UMA with task_name="oc20", I don't have to consider the reference reaction anymore. I just need to compute E(Adsorbate + Slab) - E(Slab). Is that correct?

  2. In my original research, I was fine-tuning my model, using the eq2_153M_ec4_allmd.pt checkpoint to generate a config.yml. With that config.yml file, I used the 'main.py' script to run the fine-tuning on the database I created (which I split into train.db, val.db and test.db). In fairchem v2, is it the same procedure, but now using uma_sm.pt? Or is the procedure for fine-tuning a model on a given database different now?

Thank you!

brunosamp4 avatar May 28 '25 16:05 brunosamp4
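The reference reaction quoted in the question above can be made concrete with a small sketch. It evaluates the stoichiometric coefficients for a given CxHyOzNw adsorbate and checks that the gas-phase side supplies exactly the adsorbate's atoms. The function names `reference_coefficients` and `atoms_supplied` are illustrative only and are not part of fairchem.

```python
from fractions import Fraction


def reference_coefficients(x, y, z, w):
    """Gas-phase coefficients for forming CxHyOzNw* via the quoted reaction.

    A negative coefficient means the species ends up on the product side.
    """
    return {
        "CO": Fraction(x),
        "H2": Fraction(x) + Fraction(y, 2) - Fraction(z),
        "H2O": Fraction(z) - Fraction(x),
        "N2": Fraction(w, 2),
    }


def atoms_supplied(coeffs):
    """Count the atoms delivered by the gas-phase species (balance check)."""
    return {
        "C": coeffs["CO"],
        "H": 2 * coeffs["H2"] + 2 * coeffs["H2O"],
        "O": coeffs["CO"] + coeffs["H2O"],
        "N": 2 * coeffs["N2"],
    }


# Ethane, C2H6: x=2, y=6, z=0, w=0
ethane = reference_coefficients(x=2, y=6, z=0, w=0)
for species, coeff in ethane.items():
    print(f"{species}: {coeff}")
print({element: int(n) for element, n in atoms_supplied(ethane).items()})
```

For C2H6 this gives 2 CO, 5 H2 and -2 H2O, i.e. 2 CO + 5 H2 -> C2H6* + 2 H2O once the water is moved to the product side.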

Sorry, you do still need to include an energy corresponding to the adsorbate reference. If you want to be consistent with previous models and the OC20 literature, you can get those directly from Table 5 in the OC20 paper:

[Image: Table 5 from the OC20 paper]

So it would be E_ML(adsorbate+slab) - E_ML(Slab) - gas_ref(Adsorbate). For example, E(H2O) = 2*(-3.477) + (-7.204) = -14.158 eV

mshuaibii avatar May 29 '25 15:05 mshuaibii
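The gas-phase reference described above is just a sum of per-atom energies over the adsorbate's composition. A minimal sketch, assuming the per-atom values quoted in this thread (H, O and C only; other elements would have to be read from the OC20 table) and a hypothetical helper name `gas_reference_energy`:

```python
# Per-atom reference energies in eV, as quoted in this thread from
# Table 5 of the OC20 paper. Only the elements mentioned here are listed.
ATOM_REFERENCE_EV = {"H": -3.477, "O": -7.204, "C": -7.282}


def gas_reference_energy(composition):
    """Sum per-atom reference energies for an adsorbate composition.

    `composition` maps element symbol to atom count, e.g. {"H": 2, "O": 1}.
    """
    return sum(ATOM_REFERENCE_EV[el] * n for el, n in composition.items())


water = gas_reference_energy({"H": 2, "O": 1})
ethane = gas_reference_energy({"C": 2, "H": 6})
print(f"H2O reference:  {water:.3f} eV")   # -14.158 eV
print(f"C2H6 reference: {ethane:.3f} eV")  # -35.426 eV
```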

@mshuaibii OK, so I am running the calculation with

from ase.build import fcc100, add_adsorbate, molecule
from ase.optimize import LBFGS
from ase.io import read, write
from fairchem.core import pretrained_mlip, FAIRChemCalculator

predictor = pretrained_mlip.get_predict_unit("uma-sm", device="cpu")
calc = FAIRChemCalculator(predictor, task_name="oc20")

slab = read('pt-c2h6.xyz')

slab.calc = calc

opt = LBFGS(slab)
opt.run(0.05, 100)

write('pt-c2h6-free-2.json', slab, format='json')

And then I ran the same calculation, but with only the bare slab. I get these two values and, for the adsorbate, I use the values you mentioned. So, for C2H6, I'd use (2*(-7.282) + 6*(-3.477)), right?

Would that be consistent with the UMA model?

brunosamp4 avatar May 29 '25 16:05 brunosamp4

Yup exactly!

mshuaibii avatar May 30 '25 00:05 mshuaibii

@mshuaibii OK, thank you. I think my Eads result still needs improvement, but I'll try estimating the energy with AdsorbML, maybe. I saw that there is now a "uma-s-1" model. Is it an improvement on "uma-sm"?

brunosamp4 avatar Jun 02 '25 16:06 brunosamp4

uma-s-1 is the same as uma-sm; we just renamed it to introduce versioning. Can you share an example script of your use case with the results and what you expect? How big is the error you are seeing? Happy to spot-check to make sure nothing is missing or off.

mshuaibii avatar Jun 02 '25 18:06 mshuaibii

Thank you @mshuaibii ! Here is the script of the Pt + adsorbate system:

from ase.build import fcc100, add_adsorbate, molecule
from ase.optimize import LBFGS
from ase.constraints import FixAtoms
from ase.io import read, write
from fairchem.core import pretrained_mlip, FAIRChemCalculator

predictor = pretrained_mlip.get_predict_unit("uma-sm", device="cpu")
calc = FAIRChemCalculator(predictor, task_name="oc20")

slab = read('pt-c2h6.xyz')
slab.calc = calc
constraint = FixAtoms(indices=range(18))
slab.set_constraint(constraint)
opt = LBFGS(slab)
opt.run(0.05, 100)
write('pt-c2h6-2lay.json',  slab, format='json')

And here is the bare slab calculation


from ase.build import fcc100, add_adsorbate, molecule
from ase.constraints import FixAtoms
from ase.optimize import LBFGS
from ase.io import read, write
from fairchem.core import pretrained_mlip, FAIRChemCalculator
predictor = pretrained_mlip.get_predict_unit("uma-sm", device="cpu")
calc = FAIRChemCalculator(predictor, task_name="oc20")
slab = read('4ml-9ptcluster.xyz')
slab.calc = calc
constraint = FixAtoms(indices=range(9))
slab.set_constraint(constraint)
opt = LBFGS(slab)
opt.run(0.05, 100)
write('pt-9pt-2lay.json',  slab, format='json')

The energy of the first one is -227.174212 eV and that of the bare cluster is -187.158260 eV. With E(C2H6) = -35.4260 eV for the isolated adsorbate, the adsorption energy is about -4 eV, but the experimental Eads is -0.2819 eV. Am I missing something in the Python inputs? The BFGS optimizer gives me the same result. Thank you again for helping me.

brunosamp4 avatar Jun 02 '25 18:06 brunosamp4
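The arithmetic in the post above can be checked with a short sketch that applies the formula given earlier in the thread, E_ML(adsorbate+slab) - E_ML(Slab) - gas_ref(Adsorbate), to the three quoted numbers. The helper name `adsorption_energy` is illustrative, not a fairchem function.

```python
def adsorption_energy(e_system, e_slab, e_gas_ref):
    """E_ads = E(adsorbate+slab) - E(slab) - gas_ref(adsorbate), all in eV."""
    return e_system - e_slab - e_gas_ref


e_ads = adsorption_energy(
    e_system=-227.174212,  # relaxed Pt + C2H6, as reported in the post
    e_slab=-187.158260,    # relaxed bare Pt, as reported in the post
    e_gas_ref=-35.4260,    # 2*(-7.282) + 6*(-3.477)
)
print(f"E_ads = {e_ads:.3f} eV")  # -4.590 eV
```

This value is still expressed relative to the OC20 gas-phase reference, not relative to gas-phase C2H6.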

Here are the xyz files: pt-c2h6.xyz

44
Lattice="8.394416784003516 0.0 0.0 4.197208392001758 7.269778184901512 0.0 0.0 0.0 47.23325771917803" Properties=species:S:1:pos:R:3:tags:I:1 pbc="T T T"
Pt       0.00000000       0.00000000      20.00000000        4
Pt       2.77185858       0.00000000      20.00000000        4
Pt       5.54371716       0.00000000      20.00000000        4
Pt       1.38592929       2.40049995      20.00000000        4
Pt       4.15778787       2.40049995      20.00000000        4
Pt       6.92964646       2.40049995      20.00000000        4
Pt       2.77185858       4.80099990      20.00000000        4
Pt       5.54371716       4.80099990      20.00000000        4
Pt       8.31557575       4.80099990      20.00000000        4
Pt       1.38592929       0.80016665      22.26321306        3
Pt       4.15778787       0.80016665      22.26321306        3
Pt       6.92964646       0.80016665      22.26321306        3
Pt       2.77185858       3.20066660      22.26321306        3
Pt       5.54371716       3.20066660      22.26321306        3
Pt       8.31557575       3.20066660      22.26321306        3
Pt       4.15778787       5.60116655      22.26321306        3
Pt       6.92964646       5.60116655      22.26321306        3
Pt       9.70150504       5.60116655      22.26321306        3
Pt      -0.00000000       1.60033330      24.52642611        2
Pt       2.77185858       1.60033330      24.52642611        2
Pt       5.54371716       1.60033330      24.52642611        2
Pt       1.38592929       4.00083325      24.52642611        2
Pt       4.15778787       4.00083325      24.52642611        2
Pt       6.92964646       4.00083325      24.52642611        2
Pt       2.77185858       6.40133319      24.52642611        2
Pt       5.54371716       6.40133319      24.52642611        2
Pt       8.31557575       6.40133319      24.52642611        2
Pt       0.00000000       0.00000000      26.78963917        1
Pt       2.77185858       0.00000000      26.78963917        1
Pt       5.54371716       0.00000000      26.78963917        1
Pt       1.38592929       2.40049995      26.78963917        1
Pt       4.15778787       2.40049995      26.78963917        1
Pt       6.92964646       2.40049995      26.78963917        1
Pt       2.77185858       4.80099990      26.78963917        1
Pt       5.54371716       4.80099990      26.78963917        1
Pt       8.31557575       4.80099990      26.78963917        1
H        6.74122348       4.12098752      30.40968098        0
H        6.96865908       3.17148090      28.92735405        0
H        6.15545920       4.74331029      28.85021347        0
H        4.44489025       2.88637523      28.71238163        0
H        4.21066085       3.78975603      30.21710220        0
C        4.92324334       3.14528633      29.68489244        0
C        6.26123108       3.83287284      29.46315005        0
H        5.02667482       2.20430380      30.24188399        0

and 4ml-9ptcluster.xyz

36
Lattice="8.394417164447733 0.0 0.0 4.197208582223866 7.2697785143758695 0.0 0.0 0.0 47.23325985984131" Properties=species:S:1:pos:R:3:tags:I:1:forces:R:3 energy=-0.5338616371154785 pbc="T T T"
Pt       0.00000000       0.00000000      20.00000000        4      -0.31235087      -0.16667318      -0.23576574
Pt       2.77185858       0.00000000      20.00000000        4      -0.04428514      -0.27879435      -0.42013121
Pt       5.54371716       0.00000000      20.00000000        4       0.17418820      -0.36001483      -0.35575607
Pt       1.38592929       2.40049995      20.00000000        4      -0.27495486       0.11659463      -0.43486473
Pt       4.15778787       2.40049995      20.00000000        4      -0.00221496       0.00802770      -0.62820607
Pt       6.92964646       2.40049995      20.00000000        4       0.23018552      -0.08997133      -0.55403936
Pt       2.77185858       4.80099990      20.00000000        4      -0.22530563       0.33798230      -0.34853876
Pt       5.54371716       4.80099990      20.00000000        4       0.03447652       0.24354135      -0.52992260
Pt       8.31557575       4.80099990      20.00000000        4       0.26745248       0.15026639      -0.46366435
Pt       1.38592929       0.80016665      22.26321306        3      -0.33770394      -0.26172283       0.28692761
Pt       4.15778787       0.80016665      22.26321306        3      -0.02599625      -0.40194958       0.34261608
Pt       6.92964646       0.80016665      22.26321306        3       0.34787211      -0.48480666       0.29408494
Pt       2.77185858       3.20066660      22.26321306        3      -0.29877108       0.11824526       0.20432311
Pt       5.54371716       3.20066660      22.26321306        3       0.01767213      -0.01681863       0.25907558
Pt       8.31557575       3.20066660      22.26321306        3       0.40290275      -0.13287738       0.21112162
Pt       4.15778787       5.60116655      22.26321306        3      -0.25182065       0.48609477       0.05739736
Pt       6.92964646       5.60116655      22.26321306        3       0.04976374       0.37698632       0.10724723
Pt       9.70150504       5.60116655      22.26321306        3       0.42119065       0.25606629       0.06632011
Pt      -0.00000000       1.60033330      24.52642611        2      -0.43969834      -0.19299160      -0.23514140
Pt       2.77185858       1.60033330      24.52642611        2      -0.05670149      -0.30899432      -0.27734968
Pt       5.54371716       1.60033330      24.52642611        2       0.24180123      -0.42035222      -0.22310247
Pt       1.38592929       4.00083325      24.52642611        2      -0.39924374       0.14384851      -0.17450312
Pt       4.15778787       4.00083325      24.52642611        2      -0.01883423       0.03337597      -0.21485627
Pt       6.92964646       4.00083325      24.52642611        2       0.29727450      -0.10432465      -0.16424841
Pt       2.77185858       6.40133319      24.52642611        2      -0.29030764       0.46974319      -0.16933322
Pt       5.54371716       6.40133319      24.52642611        2       0.07420910       0.38521412      -0.20935112
Pt       8.31557575       6.40133319      24.52642611        2       0.38585782       0.24765463      -0.15989113
Pt       0.00000000       0.00000000      26.78963917        1      -0.30347216      -0.14542174       0.34515211
Pt       2.77185858       0.00000000      26.78963917        1      -0.07683324      -0.23726018       0.40833247
Pt       5.54371716       0.00000000      26.78963917        1       0.18301935      -0.33111823       0.23034151
Pt       1.38592929       2.40049995      26.78963917        1      -0.22559471       0.06862979       0.54017049
Pt       4.15778787       2.40049995      26.78963917        1       0.00387780      -0.02380829       0.61236179
Pt       6.92964646       2.40049995      26.78963917        1       0.27744064      -0.13308601       0.42513713
Pt       2.77185858       4.80099990      26.78963917        1      -0.17265809       0.30443525       0.48137310
Pt       5.54371716       4.80099990      26.78963917        1       0.04166083       0.22511321       0.55236053
Pt       8.31557575       4.80099990      26.78963917        1       0.31119263       0.11680170       0.36831415

brunosamp4 avatar Jun 03 '25 12:06 brunosamp4

Ah you're comparing to experiments, that's always one layer more complex.

For starters, the adsorption energy is a global-minimum problem, so the way you have it set up here evaluates only one configuration, which may not correspond to the global minimum. You can try AdsorbML - https://github.com/facebookresearch/fairchem/blob/main/src/fairchem/applications/AdsorbML/tutorials/adsorbml_walkthrough.ipynb. This pipeline evaluates the adsorption energy by enumerating many configurations to find the minimum. We generally do N=100 placements.

mshuaibii avatar Jun 03 '25 12:06 mshuaibii
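The final selection step of such a search can be sketched in a few lines: relax every candidate placement, then keep the lowest energy. This is not the AdsorbML code. The helper `best_placement`, the site labels and the energies are all placeholders for illustration, not real results.

```python
def best_placement(relaxed_energies):
    """Return (label, energy) of the lowest-energy relaxed configuration."""
    label = min(relaxed_energies, key=relaxed_energies.get)
    return label, relaxed_energies[label]


# Placeholder total energies (eV) for a few relaxed placements; not real data.
relaxed_energies = {"top": -10.20, "bridge": -10.35, "hollow": -10.31}

label, energy = best_placement(relaxed_energies)
print(f"lowest-energy placement: {label} ({energy:.2f} eV)")
```

In practice each entry would come from its own relaxation of a different initial adsorbate position, and the minimum over all of them is what enters the adsorption energy.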

@mshuaibii That's true, I agree. Thank you! That was very helpful.

brunosamp4 avatar Jun 03 '25 12:06 brunosamp4

@brunosamp4 The experimental energy you mention is a small negative number. My guess is you mean the adsorption energy of C2H6 relative to gas phase C2H6 (and the small negative number means there's a weak attraction).

The -4 eV you're seeing is very large, and much more negative than the experimental one. @mshuaibii is 100% correct that you should find the best adsorption site on the surface, but it can only make the number lower by finding a better site, which will make your problem worse.

I think the issue is the gas-phase reference that Muhammed mentioned: it gives the energy relative to a linear combination of CO, H2O, and H2, not C2H6(g). So you need to subtract the reaction energy of 2CO + 5H2 → C2H6 + 2H2O, which happens to be -3.59 eV or so (based on experimental thermochemistry). That gets you quite close to the experimental value!

Please follow this tutorial to make sure you understand the correction and can arrive at the -3.59 eV yourself (and verify that I didn't make any mistakes!) https://fair-chem.github.io/catalysts/examples_tutorials/OCP-introduction.html

zulissimeta avatar Jun 03 '25 14:06 zulissimeta
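One way to arrive at the roughly -3.59 eV reaction energy is from standard gas-phase formation enthalpies. The sketch below assumes approximate textbook values at 298 K, which are not taken from this thread and should be checked against a thermochemistry table. The helper names are hypothetical.

```python
KJ_PER_MOL_PER_EV = 96.485  # 1 eV per particle is about 96.485 kJ/mol

# Approximate standard gas-phase formation enthalpies at 298 K, in kJ/mol.
# Assumed textbook values, not taken from the thread.
FORMATION_KJ_MOL = {"CO": -110.5, "H2": 0.0, "H2O": -241.8, "C2H6": -84.0}


def reaction_energy_ev(reactants, products):
    """Reaction enthalpy in eV from {species: coefficient} dicts."""
    def total(side):
        return sum(FORMATION_KJ_MOL[s] * n for s, n in side.items())

    return (total(products) - total(reactants)) / KJ_PER_MOL_PER_EV


def to_gas_phase_reference(e_ads_oc20_ref, e_reaction):
    """Shift an OC20-referenced adsorption energy to the molecular gas reference."""
    return e_ads_oc20_ref - e_reaction


# 2 CO + 5 H2 -> C2H6 + 2 H2O
e_rxn = reaction_energy_ev({"CO": 2, "H2": 5}, {"C2H6": 1, "H2O": 2})
print(f"reaction energy: {e_rxn:.2f} eV")  # -3.59 eV
```

Because the reaction energy is negative, subtracting it makes an OC20-referenced adsorption energy about 3.59 eV less negative.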

@zulissimeta Oh, I see. I didn't realize that we should use the reference reaction here as well. Thank you very much! Now I'm getting good results.

brunosamp4 avatar Jun 03 '25 18:06 brunosamp4