Py_xDH
Computational cost becomes too expensive for large basis sets
Hello!
I'm trying to use the DerivOnce module, but I found that its computational cost grows extremely fast with the basis set size. For NaCl (2.5 Ang.), the computational cost with the aug-cc-pVTZ basis set is quite reasonable, but with aug-cc-pVQZ the calculation never finishes. Any tips or ideas for this? (It seems that calculations with more than about 120 basis functions suffer from this problem.) Below is what I tried.
Thank you very much!
```python
from pyscf import gto, dft, lib
from pyxdh.DerivOnce import GradXDH, DipoleXDH, DipoleMP2
import numpy as np
np.set_printoptions(7, suppress=True, linewidth=120)

lib.num_threads(4)

mol = gto.M(
    verbose = 4,
    atom = '''
    na 0 0 0
    cl 0 0 2.5
    ''',
    charge = 0,
    spin = 0,
    basis = 'aug-cc-pvqz',
    max_memory = 10000,
)
mol.build()

grids = dft.Grids(mol)
grids.atom_grid = (75, 302)

scf_eng = dft.RKS(mol)
scf_eng.xc = 'b3lypg'
scf_eng.grids = grids

nc_eng = dft.RKS(mol)
nc_eng.xc = '0.8033HF - 0.0140LDA + 0.2107B88, 0.6789LYP'
nc_eng.grids = grids

config = {
    'scf_eng': scf_eng,
    'nc_eng': nc_eng,
    'cc': 0.3211,
    'cphf_tol': 1e-12,
}

dip = DipoleXDH(config)
print(dip.E_1 * 2.541765)
```
This problem can be resolved by setting the following environment variable in bash before running the Python script:
$ export MAXMEM=6
Any value larger than 6 (GB) is also okay; it should be set even larger if more basis functions are used.
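If exporting the variable in the shell is inconvenient, it should also be possible to set it from within the script itself, provided this happens before pyxdh is imported (this is an assumption based on the variable being read once at import time). A minimal sketch:

```python
# Minimal sketch: set MAXMEM from within Python instead of the shell.
# Assumption: pyxdh reads the MAXMEM environment variable at import time,
# so it must be set before any pyxdh import. The value 10 (GB) is arbitrary.
import os
os.environ["MAXMEM"] = "10"

from pyxdh.DerivOnce import DipoleXDH  # import only after MAXMEM is set
```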
Here is some explanation.
Any value smaller than 6 (GB) can leave the program stuck in the ERI AO->MO transformation at O(N^8) complexity (instead of O(N^5)), since storing the AO-basis or MO-basis ERI tensor can consume up to 5.94 GB of memory for aug-cc-pVQZ:
https://github.com/ajz34/Py_xDH/blob/bd4940c30f195a566bd7f054f3143c5461840c8b/pyxdh/DerivOnce/deriv_once_scf.py#L515
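To illustrate why the two complexities differ, here is a toy sketch with small random arrays (mock data and dimensions, not pyxdh's actual implementation):

```python
import numpy as np

nao = 10                                      # toy dimension; NaCl/aug-cc-pVQZ has far more AOs
C = np.random.rand(nao, nao)                  # mock MO coefficient matrix
eri_ao = np.random.rand(nao, nao, nao, nao)   # mock AO-basis ERI tensor

# One-shot contraction: with optimize=False this loops over all eight indices,
# i.e. effectively O(N^8), and is already noticeably slow at this tiny size.
eri_mo_naive = np.einsum("up,vq,uvkl,kr,ls->pqrs", C, C, eri_ao, C, C, optimize=False)

# Stepwise quarter transformations: four O(N^5) contractions.
tmp = np.einsum("uvkl,up->pvkl", eri_ao, C)
tmp = np.einsum("pvkl,vq->pqkl", tmp, C)
tmp = np.einsum("pqkl,kr->pqrl", tmp, C)
eri_mo = np.einsum("pqrl,ls->pqrs", tmp, C)

assert np.allclose(eri_mo_naive, eri_mo)
```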
Although this environment variable is intentional in pyxdh, it actually introduces some inconvenience ... It defines how much memory space is left for np.einsum to store intermediate tensors:
https://github.com/ajz34/Py_xDH/blob/bd4940c30f195a566bd7f054f3143c5461840c8b/pyxdh/DerivOnce/deriv_once_scf.py#L15
If the memory limit is set too small, np.einsum just performs a naive tensor contraction without parallelism.
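Here is an illustrative sketch of what that memory budget means for plain numpy (the array sizes and the budget value are made up, and this is not the exact call pyxdh makes):

```python
import numpy as np

nao = 10
C = np.random.rand(nao, nao)
eri_ao = np.random.rand(nao, nao, nao, nao)

# Ask numpy for a contraction path; in the tuple form of `optimize`, the second
# element is the maximum number of elements allowed in any intermediate tensor.
path, report = np.einsum_path("up,vq,uvkl,kr,ls->pqrs", C, C, eri_ao, C, C,
                              optimize=("greedy", 4 * nao ** 4))
print(report)  # shows the scaling of each pairwise contraction in the chosen path

# Reuse the pre-computed path in the actual contraction. With a generous budget
# the optimized, parallel-friendly path is used; with a tiny budget (e.g.
# optimize=("greedy", 1)) the path degrades toward the naive contraction
# described above.
eri_mo = np.einsum("up,vq,uvkl,kr,ls->pqrs", C, C, eri_ao, C, C, optimize=path)
```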
I have to say that this program is not optimized for CPU or memory efficiency, especially memory :disappointed_relieved: I hope you won't run out of memory if you run even larger-scale calculations.
Thank you very much for the kind and detailed explanation! Now it runs well :D