RMG-database
Add AEC/BAC for dlpno-ccsd(t)/def2svp//wb97xd/def2svp
This PR adds AEC and BAC for the following level of theory:
- sp: dlpno-ccsd(t)/def2svp def2svp/c NormalPNO using ORCA
- opt/freq: wb97xd/def2svp using Gaussian
For Petersson-type BACs
- Training RMSE/MAE before fitting: 18.08/13.86 kcal/mol
- Training RMSE/MAE after fitting: 6.60/3.05 kcal/mol
- Plots: 1_dlpnoccsd(t)_def2svp__wb97xd_def2svp_errors.pdf, 1_dlpnoccsd(t)_def2svp__wb97xd_def2svp_correlation.pdf

For Melius-type BACs
- Training RMSE/MAE before fitting: 18.08/13.86 kcal/mol
- Training RMSE/MAE after fitting: 6.77/3.70 kcal/mol
- Plots: 2_dlpnoccsd(t)_def2svp__wb97xd_def2svp_errors.pdf, 2_dlpnoccsd(t)_def2svp__wb97xd_def2svp_correlation.pdf
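For readers unfamiliar with how the "before fitting" and "after fitting" numbers relate, here is a minimal sketch. It assumes a Petersson-type correction can be treated as a sum of per-bond-type terms fitted by ordinary linear least squares. This is not the actual fitting code used for this PR. The bond counts, the errors, and the helper `rmse_mae` are made up purely for illustration.

```python
import numpy as np


def rmse_mae(errors):
    """Return (RMSE, MAE) of an array of signed errors."""
    errors = np.asarray(errors, dtype=float)
    rmse = float(np.sqrt(np.mean(errors ** 2)))
    mae = float(np.mean(np.abs(errors)))
    return rmse, mae


# Hypothetical toy data: one row per molecule, one column per bond type
# (for example C-H, C-C, C=O). These are NOT values from the training set.
bond_counts = np.array([
    [4, 0, 0],
    [6, 1, 0],
    [2, 0, 1],
    [4, 1, 1],
    [8, 2, 0],
], dtype=float)

# Made-up signed errors (reference minus calculated), in kcal/mol.
errors_before = np.array([2.1, 3.4, 1.9, 3.6, 4.9])

# Least-squares fit: one correction per bond type such that
# bond_counts @ bac approximates errors_before as closely as possible.
bac, *_ = np.linalg.lstsq(bond_counts, errors_before, rcond=None)

# Residual errors once the fitted corrections are applied.
errors_after = errors_before - bond_counts @ bac

rmse_before, mae_before = rmse_mae(errors_before)
rmse_after, mae_after = rmse_mae(errors_after)
print(f"RMSE/MAE before fitting: {rmse_before:.2f}/{mae_before:.2f} kcal/mol")
print(f"RMSE/MAE after fitting: {rmse_after:.2f}/{mae_after:.2f} kcal/mol")
```

Because a zero correction is always an allowed solution, a least-squares fit of this kind cannot increase the training RMSE. A large residual after fitting therefore points at the underlying energies or the reference data rather than at the fit itself.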
Thanks for this PR! To summarize what we discussed in person: your intuition that these errors are too large seems sensible, and the somewhat bimodal error distribution is unexpected. Results from other levels of theory support this hypothesis. For context, the BACs from just wb97xd3 in QChem, which were added in #459 and #563, gave these results:
Fitting Petersson-type BACs for LevelOfTheory(method='wb97xd3',basis='def2tzvp',software='qchem')...
RMSE/MAE before fitting: 6.79/5.12 kcal/mol
RMSE/MAE after fitting: 2.22/1.33 kcal/mol
Fitting Melius-type BACs for LevelOfTheory(method='wb97xd3',basis='def2tzvp',software='qchem')...
RMSE/MAE before fitting: 6.79/5.12 kcal/mol
RMSE/MAE after fitting: 2.22/1.43 kcal/mol
It seems surprising that wb97xd3 alone would do better than the composite approach with DLPNO, which should be quite reliable. Similarly, I would expect the DLPNO results to be similar to these CCSD(T)-F12 results:
Fitting Petersson-type BACs for CompositeLevelOfTheory(freq=LevelOfTheory(method='wb97xd3',basis='def2tzvp',software='qchem'),energy=LevelOfTheory(method='ccsd(t)f12',basis='ccpvdzf12',software='molpro'))...
RMSE/MAE before fitting: 1.26/0.94 kcal/mol
RMSE/MAE after fitting: 0.83/0.52 kcal/mol
I believe I also ran the BAC fitting procedure using values that existed on the bacs_and_enthalpy branch and presented the results on slide 26 of my group meeting presentation on 2022-04-06 (link). Although that run used TightPNO, the data support the broader hypothesis that the values in this PR should not be substantially worse, so perhaps something odd is going on with the current BACs.
The next step is probably to fit AECs and BACs for just wb97xd in Gaussian to get another baseline. It would be puzzling if running a DLPNO single point on those geometries made the results worse.
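To make the comparison above concrete, this small sketch ranks the Petersson-type after-fitting statistics quoted in this thread. The numbers are copied from the logs and the PR description. The dictionary labels are informal names chosen here for readability, not identifiers from the database.

```python
# (RMSE, MAE) in kcal/mol after Petersson-type fitting, as quoted in this thread.
after_fit = {
    "dlpno-ccsd(t)/def2svp//wb97xd/def2svp (this PR)": (6.60, 3.05),
    "wb97xd3/def2tzvp": (2.22, 1.33),
    "ccsd(t)-f12/cc-pvdz-f12//wb97xd3/def2tzvp": (0.83, 0.52),
}

# Sort from lowest to highest RMSE.
ranked = sorted(after_fit.items(), key=lambda item: item[1][0])

for name, (rmse, mae) in ranked:
    print(f"{name:<50s} RMSE={rmse:5.2f}  MAE={mae:5.2f}")
```

Once the wb97xd/def2svp baseline in Gaussian is available, adding it to the dictionary would show directly whether the DLPNO single point improves on its own geometries' level of theory.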