Add `OpengridCalculation` for `open_grid.x` code
QE has an `open_grid.x` code that unfolds an irreducible-Brillouin-zone calculation to a full k-mesh, which significantly reduces the computational cost when a subsequent calculation requires a full k-mesh. This PR adds an `OpengridCalculation` class to support the code.
To allow a subsequent `ProjwfcCalculation` on top of the `OpengridCalculation`, I have to introduce a somewhat ugly input/output spec:

- a `structure` input which isn't used by the `OpengridCalculation`, but is needed because `ProjwfcCalculation` implicitly requires the parent calculation to have an input `structure` (so it can parse the atomic orbitals).
- I modified the `ProjwfcParser` to allow it to use the output `kpoints` if there is no input `kpoints` in the parent calculation (projwfc should use the output `kpoints`, which is the full k-mesh).
Thanks @yunfeng. Before I review this, a quick couple of questions:

- Where exactly is the computation time won? If you pass an IBZ kpoints to a calculation, will QE try to do the unfolding itself, but in a less effective way compared to `open_grid.x`? Surely calculating a full mesh is more expensive compared to a reduced mesh?
- The issue you describe of the `ProjwfcParser` already cropped up a long time ago and is documented in #299. Ideally, we should really just fix this, and then you also don't have to work around this with weird specs etc.
Thanks @sphuber,
> Thanks @Yunfeng. Before I review this, a quick couple of questions:
>
> - Where exactly is the computation time won? If you pass an IBZ kpoints to a calculation, will QE try to do the unfolding itself, but in a less effective way compared to `open_grid.x`? Surely calculating a full mesh is more expensive compared to a reduced mesh?
In essence, `open_grid.x` just reads the wavefunctions at the IBZ kpoints and applies symmetry operations to generate the wavefunctions at the remaining kpoints. With `pw.x`, we can do an nscf calculation on the full k-mesh restarting from the IBZ calculation, but that requires diagonalization of the Kohn-Sham equations and orthogonalization of the KS orbitals, which is much slower than reading and rewriting wavefunctions. The difference is something like 30 mins vs. 3 mins.
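To illustrate just the k-point bookkeeping behind this (not QE's actual wavefunction machinery), here is a toy sketch on a 2D square lattice: fold a full mesh down to an irreducible set of orbit representatives under the point group, then unfold it back by applying the symmetry operations, which is the role `open_grid.x` plays for the wavefunctions. The lattice, mesh size, and all names are illustrative assumptions, not code from QE or this plugin.

```python
N = 4  # a 4x4 k-mesh on a 2D square lattice, coordinates in units of 1/N

# The 8 point-group operations of the square lattice (C4v), as 2x2 integer
# matrices acting on fractional k-coordinates modulo the mesh size.
OPS = [
    ((1, 0), (0, 1)), ((-1, 0), (0, -1)),
    ((0, -1), (1, 0)), ((0, 1), (-1, 0)),
    ((1, 0), (0, -1)), ((-1, 0), (0, 1)),
    ((0, 1), (1, 0)), ((0, -1), (-1, 0)),
]

def apply(op, k):
    """Apply a symmetry matrix to a mesh point, wrapping back into the mesh."""
    (a, b), (c, d) = op
    return ((a * k[0] + b * k[1]) % N, (c * k[0] + d * k[1]) % N)

full = {(i, j) for i in range(N) for j in range(N)}

# Fold the full mesh down to orbit representatives: this irreducible set is
# what a pw.x scf/nscf run would actually sample.
ibz = []
seen = set()
for k in sorted(full):
    if k not in seen:
        ibz.append(k)
        seen.update(apply(op, k) for op in OPS)

# Unfold: regenerate the full mesh from the IBZ by applying the symmetry
# operations -- cheap bookkeeping, no re-diagonalization needed.
unfolded = {apply(op, k) for k in ibz for op in OPS}

print(len(ibz), "IBZ points unfold to", len(unfolded), "of", len(full))
```

For this 4x4 mesh only 6 of the 16 points are irreducible, which is why restarting from the IBZ and unfolding is so much cheaper than solving on the full mesh.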
> - The issue you describe of the `ProjwfcParser` already cropped up a long time ago and is documented in #299 (ProjwfcParser uses input nodes up to 3 links back, and prevents easy testing). Ideally, we should really just fix this, and then you also don't have to work around this with weird specs etc.
Yes, I agree we should fix the `ProjwfcParser`; the question is how to fix it:

The `ProjwfcParser` requires a `kpoints`, a `structure`, and an `output_parameters` containing `number_of_spin_components`:

https://github.com/aiidateam/aiida-quantumespresso/blob/c49def7a238617439feb75afd0d0cf9fbb681613/aiida_quantumespresso/parsers/projwfc.py#L404

In this opengrid case, the `ProjwfcParser` should use the opengrid output `kpoints`, since it is different from its input `kpoints`, so we can add an optional input `kpoints` to `ProjwfcCalculation`. Similarly, we can add an optional input `structure` for `ProjwfcCalculation`. But for `number_of_spin_components`, adding an optional input `parameters` containing `number_of_spin_components` to `ProjwfcCalculation` is a bit weird. Also, the `projwfc.x` raw output does not contain spin info.

Or, in a totally different way: the XML file in the `parent_folder` contains the info needed to generate `kpoints`, `structure`, and `number_of_spin_components`. Should we parse the XML in `ProjwfcParser`?
> In essence, `open_grid.x` just reads the wavefunctions at the IBZ kpoints and applies symmetry operations to generate the wavefunctions at the remaining kpoints. With `pw.x`, we can do an nscf calculation on the full k-mesh restarting from the IBZ calculation, but that requires diagonalization of the Kohn-Sham equations and orthogonalization of the KS orbitals, which is much slower than reading and rewriting wavefunctions. The difference is something like 30 mins vs. 3 mins.
Thanks a lot. So just to see if I get the use case here: after an NSCF performed on a set of kpoints in the IBZ, instead of doing a new NSCF restart on the full mesh before passing it to `projwfc.x`, we first go through `open_grid.x` (instead of `pw.x`) and then directly go to `projwfc.x`?
> Yes I agree we should fix the `ProjwfcParser`, the question is how to fix it: the `ProjwfcParser` requires a `kpoints`, a `structure`, and an `output_parameters` containing `number_of_spin_components`:
>
> https://github.com/aiidateam/aiida-quantumespresso/blob/c49def7a238617439feb75afd0d0cf9fbb681613/aiida_quantumespresso/parsers/projwfc.py#L404
> In this opengrid case, the `ProjwfcParser` should use the opengrid output `kpoints`, since it is different from its input `kpoints`, so we can add an optional input `kpoints` to `ProjwfcCalculation`. Similarly, we can add an optional input `structure` for `ProjwfcCalculation`.
I am not sure that adding inputs to `ProjwfcCalculation` is the way to go. The whole problem with the `ProjwfcParser` that we have now is that it is not self-contained and reaches up the provenance graph to get information from preceding calculations. Passing an output of the parent calculation in as an input is just another version of this same workaround.
> But for `number_of_spin_components`, adding an optional input `parameters` containing `number_of_spin_components` to `ProjwfcCalculation`, is a bit weird. Also the `projwfc.x` raw output does not contain spin info. Or, in a totally different way, the XML file in the `parent_folder` contains the needed info to generate `kpoints`, `structure`, and `number_of_spin_components`, should we parse the XML in `ProjwfcParser`?
If the output of `projwfc.x` itself does not contain the necessary information, then we should indeed get it from the XML of the parent folder, because this is exactly what `projwfc.x` bases itself on. This would also solve the problem of using a `parent_folder` that may come from different calcjobs: it should not matter to the `ProjwfcParser`, which would simply always parse the same XML. That is, assuming QE doesn't do something incredibly weird and annoying like having `pw.x` and `open_grid.x` write different output formats... :sweat_smile:
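As a rough sketch of that direction, the parser could pull the spin and k-point information straight out of the parent XML. The snippet below is a minimal stand-alone illustration using `xml.etree.ElementTree` on a schematic document; the element names are loose assumptions inspired by QE's `data-file-schema.xml`, not the exact schema, and a real fix would reuse the plugin's existing XML-parsing utilities.

```python
import xml.etree.ElementTree as ET

# Schematic parent-calculation XML; element names are illustrative
# assumptions, not the exact QE data-file-schema.xml layout.
XML = """\
<qes:espresso xmlns:qes="http://www.quantum-espresso.org/ns/qes/qes-1.0">
  <output>
    <band_structure>
      <lsda>true</lsda>
      <starting_k_points>
        <k_point weight="0.5">0.0 0.0 0.0</k_point>
        <k_point weight="0.5">0.5 0.0 0.0</k_point>
      </starting_k_points>
    </band_structure>
  </output>
</qes:espresso>
"""

root = ET.fromstring(XML)
band = root.find("output/band_structure")

# Derive the spin info from the XML instead of requiring an
# output_parameters node from the parent calculation.
lsda = band.findtext("lsda").strip() == "true"
number_of_spin_components = 2 if lsda else 1

# Read the k-points the parent actually used (for open_grid.x this would
# already be the full unfolded mesh).
kpoints = [
    [float(x) for x in kp.text.split()]
    for kp in band.find("starting_k_points").iter("k_point")
]

print(number_of_spin_components, kpoints)
```

The appeal of this route is exactly the self-containedness argued for above: the parser reads one file from the `parent_folder` and no longer cares which calcjob produced it.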
Thanks @mbercx for fixing #722, #741, and #747!
Now I have removed those strange input specs from `OpengridCalculation`, and added some tests.
Finally, the whole Wannier workchain runs smoothly 😄
Thanks @qiaojunfeng. I propose we wait for @mbercx's PRs to be finished and merged, and then we rebase your PR. Then I will give it a final pass and we can merge it.
All right, just did some checks, including:

- launch `OpenGridCalculation` on top of the `RemoteData` output folder of an scf `PwCalculation`
- launch `ProjwfcCalculation` on top of the `RemoteData` output folder of the `OpenGridCalculation`

and all things work well.
One noticeable change: I use `OpenGridCalculation` instead of `OpengridCalculation`, and `quantumespresso.open_grid` instead of `quantumespresso.opengrid` for the entry point name, to be more consistent with the QE executable name `open_grid.x`, as @sphuber suggested one year ago. Although the capitalization and the underscore look a bit unwieldy to me, I can tolerate this for the sake of consistency.
For the compatibility matrix, maybe we can add a footnote to the QE version saying that for `open_grid.x` the user needs to run QE >= 7.0, instead of raising the lower bound of the QE version?
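If a programmatic guard were ever wanted alongside that footnote, it could be as small as the following. `supports_open_grid` is a hypothetical helper, not part of the plugin, and it only encodes the assumption from this thread that `open_grid.x` needs QE >= 7.0.

```python
def supports_open_grid(qe_version: str) -> bool:
    # Hypothetical helper: per the discussion, open_grid.x is assumed
    # to require QE >= 7.0; older versions should be flagged instead
    # of raising the plugin-wide lower bound.
    parts = [int(p) for p in qe_version.split(".")[:2]]
    parts += [0] * (2 - len(parts))  # pad "7" -> (7, 0)
    return tuple(parts) >= (7, 0)

print(supports_open_grid("6.8"), supports_open_grid("7.1"))
```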
Sure, looks good to me! Thanks a lot to both. Just make sure to squash merge and add a nice commit message as per yuzh