aiida-common-workflows
Implementation of a method that suggests some computational resources
This was an open question that we already raised in February at Cineca, but haven't addressed so far.
In the original idea the `calc_engines` dict was responsible for receiving the computational resources required for each step of the relaxation. However, as far as I understood, this is not necessary for Quantum ESPRESSO, as they have a way to suggest resources. Could this become available for all the others?
In any case there are always cluster-specific resources that a user needs to specify (queue, account, ...), therefore the possibility to pass these through `calc_engines` must remain.
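For context, a minimal sketch of what such a `calc_engines` dict could look like, assuming the inner options follow AiiDA's usual CalcJob `metadata.options` (the step label and overall layout are illustrative assumptions, not the agreed interface):

```python
# Hypothetical sketch: per-step computational resources passed to a common workflow.
# The step label 'relax' and the overall layout are assumptions, not the final interface;
# the inner 'options' mirror AiiDA's standard CalcJob metadata.options.
calc_engines = {
    'relax': {
        'code': 'pw-6.5@my_cluster',           # code label@computer
        'options': {
            'resources': {'num_machines': 2},  # machine count for each job of this step
            'max_wallclock_seconds': 3600,     # walltime limit per job
            'queue_name': 'normal',            # cluster-specific: queue/partition
            'account': 'project_123',          # cluster-specific: accounting project
        },
    },
}
```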
Yes, this is possible, but it is code-specific (VASP and QE could probably share), so it might not give the results you want. I think a recommendation for this needs to happen on a per-code basis. However, common code should be shared where possible.
Also, what we see from the VASP side is that we need a resource algorithm that is problem dependent, not just on the functionality (DFT versus GW etc.), but also on a per-problem basis within the given functionality domain.
Maybe we could agree to specify the maximum allowed resources that a workflow is allowed to use on a single jobcalc. That's what we do: the code-specific workchain determines what is optimal within this bound.
This is an interesting suggestion, I like it
How would this be specified? I imagine we need at least two fields: `max_wallclock_seconds` and `max_num_machines`. But this maybe opens more questions. This probably should be the maximum wallclock for the entire workchain, no matter how many jobs it requires. Should we instead maybe require `max_node_hours` and `max_num_machines`, meaning the total workchain can spend at most `max_node_hours`, and any one job can use at most `max_num_machines`?
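To make the two alternatives concrete, here is a rough sketch of the inputs being discussed (a hypothetical illustration only; neither the key names nor the semantics are settled):

```python
# Option A: limits expressed in wallclock time.
# Whether max_wallclock_seconds applies per job or to the whole workchain is the open question.
limits_a = {
    'max_wallclock_seconds': 12 * 3600,  # wallclock budget
    'max_num_machines': 4,               # upper bound on machines for any single job
}

# Option B: limits expressed as a node-hour budget for the whole workchain,
# plus a per-job cap on the number of machines.
limits_b = {
    'max_node_hours': 48,    # total node-hours the workchain may consume
    'max_num_machines': 4,   # upper bound on machines for any single job
}
```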
In the case where there are different `calc_engines`, I'm not sure it always makes sense to have a global limit. The different codes could run on different node types (more or less "expensive") or even different computers.
I think that is exactly why we should have a global limit. A novice user might not know how much each step, reflected by the keys in the `calc_engines`, requires in terms of resources, and there can be arbitrarily many. I think it wouldn't be (novice) user-friendly to have them specify maximum resources for each step. The only reasonable simple thing here, to me, seems to be defining a global maximum for the entire workchain.
P.S.: the closing was accidental
Yeah, indeed from that perspective it would be nicer to have a single limit. But that doesn't completely solve the problem if they still need to know which queue each step should run on, etc., right?
For me the "maximum number of nodes" typically is just a poor-man's proxy for "how well does this code scale across nodes". Of course this is far from optimal, because that depends quite strongly on problem size.
Maybe a more ambitious goal would be having some way of specifying "compute capabilities", which are taken into account in the code-specific automatic resource selection.
A related problem is how to select among different nodes for the same code. This could be different microarchitectures or (probably more common) GPU-accelerated vs. CPU-only nodes. These even need to be different codes in AiiDA (because it's a different binary), even though they have the same "logical" capability.
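As a purely hypothetical illustration of the "compute capabilities" idea (none of these keys exist in AiiDA; this only sketches the kind of metadata a code-specific resource selector could take into account):

```python
# Hypothetical "capabilities" metadata for two configured code/computer pairs that
# wrap the same "logical" code, once GPU-accelerated and once CPU-only.
capabilities = {
    'pw-6.5@gpu_cluster': {
        'accelerator': 'gpu',        # GPU-accelerated binary
        'max_efficient_nodes': 8,    # rough scaling limit for typical problems
        'memory_per_node_gb': 512,
    },
    'pw-6.5@cpu_cluster': {
        'accelerator': None,         # CPU-only binary with the same logical capability
        'max_efficient_nodes': 32,
        'memory_per_node_gb': 256,
    },
}
```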
> The only reasonable simple thing here, to me, seems to be defining a global maximum for the entire workchain.
This is easy to specify, but how would you implement it? The code-specific workchain would need reliable information on every calcjob that has run and add these up. You do not get this from the scheduler; the code has to produce such output, which needs to be parsed. Then you also need to set the resources for each calcjob below that limit to ensure that not already the first calcjob overshoots (the same as you say, the specified limit is then effectively just per calcjob).

Personally, I do not care how many resources the total workchain uses; I would not want it to abort at 95% because it projected that the next calcjob takes too much. I would rather trust the specific code implementations to build in fail-safes to not waste resources, i.e. if the system cannot be converged after so many tries, stop. Another question is what you do with caching: run again/continue with new resources, or say you're done? For a maximum per calcjob this is not an issue.
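A rough sketch of the bookkeeping this would imply inside a code-specific workchain (hypothetical helper names; the consumed resources would have to come from parsed code output, as noted above):

```python
def remaining_budget(max_node_hours, finished_calcjobs):
    """Hypothetical helper: subtract the node-hours already consumed by finished
    calcjobs from the global budget. The consumed hours would have to be parsed
    from code output, since the scheduler does not report them to AiiDA."""
    used = sum(
        job['num_machines'] * job['wallclock_seconds'] / 3600.0
        for job in finished_calcjobs
    )
    return max_node_hours - used


def options_for_next_job(max_node_hours, max_num_machines, finished_calcjobs):
    """Cap the next calcjob so the global budget is not overshot."""
    remaining = remaining_budget(max_node_hours, finished_calcjobs)
    if remaining <= 0:
        raise RuntimeError('Global resource budget exhausted.')
    return {
        'resources': {'num_machines': max_num_machines},
        'max_wallclock_seconds': int(remaining / max_num_machines * 3600),
    }
```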
For the limit, specifying two fields is fine; I would stay close to the calcjob `metadata.options` and have `max_wallclock_seconds` and `max_num_machines`. Walltime is easier to add up than node-hours, because walltime is independent of the underlying computer architecture and closer to humans.
> A related problem is how to select among different nodes for the same code.
With "same" you mean the same entry point. This is done like for calcjobs: give the workchain information on which queue it should submit to for the given code node (label@computer), and a GUI should know whether queue, executable and computer name match.
Having computer-architecture-related information within AiiDA is still a general open issue, if I remember correctly. There was some talk about supercomputing centers providing a standard information file on their machines which could then be read, also for other things like runtime prediction of codes.
For me the "maximum number of nodes" typically is just a poor-man's proxy for "how well does this code scale across nodes". Of course this is far from optimal, because that depends quite strongly on problem size.
I think this is key, and it reflects not only nodes, but also walltime, memory etc. It would be difficult to suggest this. Ideally, each plugin developer would have to give recommendations for the problem at hand, which most likely vary between codes (even very similar codes like VASP and QE). Of course, we could supply a maximum, but that should in principle be obtained from the scheduler etc., given the queue slot one wants to submit to. If we take the relaxation problem, I can foresee that for VASP we would execute a few different jobs with different walltimes, memory and node distributions in order to arrive faster and/or more reliably at the end result. Specifying a maximum might ruin this. One could of course say that the plugins should obey the maximum that we receive. In general this is probably a good idea, and it should eventually come from a query to the scheduler or local hardware and be stored on the computer instance or similar, possibly through an override.
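For illustration, a per-code suggestion hook could look roughly like this (entirely hypothetical function and parameter names; the actual heuristics would live in each plugin and be far more involved):

```python
def suggest_engine_options(structure, protocol, maximum=None):
    """Hypothetical per-plugin hook: suggest resources for one step of a common
    workflow, clipped to a user- or scheduler-provided maximum."""
    # Naive heuristic purely for illustration: scale the node count with system size.
    num_machines = max(1, len(structure.sites) // 50)
    suggestion = {
        'resources': {'num_machines': num_machines},
        'max_wallclock_seconds': 6 * 3600,
    }
    if maximum is not None:
        # Obey the maximum received from the user, the scheduler or the computer setup.
        suggestion['resources']['num_machines'] = min(
            num_machines, maximum.get('max_num_machines', num_machines)
        )
        suggestion['max_wallclock_seconds'] = min(
            suggestion['max_wallclock_seconds'],
            maximum.get('max_wallclock_seconds', suggestion['max_wallclock_seconds']),
        )
    return suggestion
```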